Research Question: Can a hybrid ViT + U-Net model achieve high segmentation accuracy (Dice score) and low boundary error (HD95) when segmenting buildings from satellite imagery across diverse urban areas with varying geographic and architectural characteristics?
Dataset: Chinese GaoFen-7 (GF-7) satellite imagery
This is a high-resolution building segmentation dataset. It provides extensive coverage of urban and rural areas of China across six representative cities (Chen et al., 2024), and contains 5,175 pairs of 512×512 image tiles with 170,015 annotated buildings. Compared to other datasets constructed from satellite and aerial imagery, it offers rich, varied ground-truth labels for building extraction.
Link: A benchmark GaoFen-7 dataset for building extraction from satellite images
Model: TransUNet
This model is an adaptation of the TransUNet architecture, originally developed by Chen et al. (2021) for medical image segmentation; here it is applied to building segmentation from satellite imagery.
The model is a hybrid ViT + U-Net: a Vision Transformer serves as the encoder and a U-Net-style decoder handles upsampling. More specifically, a CNN (ResNet) extracts features and downsamples the input, the resulting feature maps are encoded by transformer blocks, and upsampling layers then decode the representation back to full resolution. The encoder is initialized from a pretrained Vision Transformer; this project uses the R50+ViT-B/16 checkpoint from Google Research's Vision Transformer implementation.
Most of the code and architecture is taken directly from the TransUNet project by Chen et al. (2021) and adapted here for building segmentation.
Source: TransUNet: Medical Image Segmentation & TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation¶
Key Changes¶
Overall, these are the key changes I made to the model's architecture and training setup to improve performance on building segmentation from satellite imagery.
The optimizer was progressively changed from SGD with momentum → Adam → AdamW → AdamW with AMSGrad. These changes significantly improved training speed, and adding AMSGrad in particular helped convergence (see the sketch after this list).
The original polynomial learning-rate decay led to inconsistent convergence and slower progress, particularly in the early epochs. I replaced it with CosineAnnealingWarmRestarts, which provided smoother, more adaptive scheduling and resulted in faster convergence, reduced training noise, and improved overall stability.
Introduced data augmentation using the Albumentations library. A stronger set of transformations initially worsened performance, so the augmentations were progressively simplified, ultimately leaving only mild geometric transforms (horizontal/vertical flips and 90° rotations), which improved generalization and test performance.
Optimized various training parameters, including batch size, learning rate, number of skip connections, and number of training epochs.
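As a minimal sketch of the optimizer and scheduler changes above (the stand-in model and iteration counts are placeholders, not the exact training cell, which appears later in the notebook):

import torch
import torch.optim as optim

model = torch.nn.Conv2d(3, 2, 3)  # stand-in for the TransUNet model
base_lr, max_iterations, num_restarts = 1e-3, 20375, 3

# AdamW with AMSGrad, as described above
optimizer = optim.AdamW(model.parameters(), lr=base_lr, weight_decay=1e-4, amsgrad=True)
# Cosine annealing with warm restarts, evenly spaced over training
scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=max_iterations // num_restarts, T_mult=1, eta_min=1e-6)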
# Loading necessary libraries
import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
import random
import h5py
import cv2
import numpy as np
import torch
from PIL import Image
from scipy import ndimage
from scipy.ndimage import zoom  # scipy.ndimage.interpolation namespace is deprecated
from torch.utils.data import Dataset
import albumentations as A
from albumentations.pytorch import ToTensorV2
## Progress bar
from tqdm.notebook import tqdm
# Check for CUDA, then MPS (for Mac), then CPU
if torch.cuda.is_available():
device = torch.device("cuda")
elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
device = torch.device("mps")
else:
device = torch.device("cpu")
print("Using device:", device)
Using device: cuda
# Retained from the original TransUNet repo; assumes single-channel images.
# Unused in this notebook -- the GF7Dataset below uses Albumentations instead.
def random_rot_flip(image, label):
k = np.random.randint(0, 4)
image = np.rot90(image, k)
label = np.rot90(label, k)
axis = np.random.randint(0, 2)
image = np.flip(image, axis=axis).copy()
label = np.flip(label, axis=axis).copy()
return image, label
def random_rotate(image, label):
angle = np.random.randint(-20, 20)
image = ndimage.rotate(image, angle, order=0, reshape=False)
label = ndimage.rotate(label, angle, order=0, reshape=False)
return image, label
class RandomGenerator(object):
def __init__(self, output_size):
self.output_size = output_size
def __call__(self, sample):
image, label = sample['image'], sample['label']
if random.random() > 0.5:
image, label = random_rot_flip(image, label)
elif random.random() > 0.5:  # second random draw, so rotation fires ~25% of the time
image, label = random_rotate(image, label)
x, y = image.shape
if x != self.output_size[0] or y != self.output_size[1]:
image = zoom(image, (self.output_size[0] / x, self.output_size[1] / y), order=3)
label = zoom(label, (self.output_size[0] / x, self.output_size[1] / y), order=0)
image = torch.from_numpy(image.astype(np.float32)).unsqueeze(0)
label = torch.from_numpy(label.astype(np.float32))
sample = {'image': image, 'label': label.long()}
return sample
Data Loader Class¶
class GF7Dataset(Dataset):
def __init__(self, image_dir, mask_dir, image_size=224, transform=None):
self.image_paths = sorted([
os.path.join(image_dir, f) for f in os.listdir(image_dir)
if f.lower().endswith(('.png', '.jpg', '.jpeg', '.tif', '.tiff'))
])
self.mask_paths = sorted([
os.path.join(mask_dir, f) for f in os.listdir(mask_dir)
if f.lower().endswith(('.png', '.jpg', '.jpeg', '.tif', '.tiff'))
])
self.image_size = image_size
self.transform = transform
def __len__(self):
return len(self.image_paths)
def __getitem__(self, idx):
image_path = self.image_paths[idx]
mask_path = self.mask_paths[idx]
image = cv2.imread(image_path)
if image is None:
raise FileNotFoundError(f"Could not read image: {image_path}")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # Convert BGR to RGB
image = cv2.resize(image, (self.image_size, self.image_size)) # Resize to target size
image = image.astype(np.float32) / 255.0 # Normalize to [0, 1]
mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
if mask is None:
raise FileNotFoundError(f"Could not read mask: {mask_path}")
mask = cv2.resize(mask, (self.image_size, self.image_size)) # Resize to target size
mask = (mask > 127).astype(np.float32) # Binarize
if self.transform:
augmented = self.transform(image=image, mask=mask)
image = augmented['image']
mask = augmented['mask']  # may still be a NumPy array, since ToTensorV2 is commented out of the pipelines
# If a transform is not provided, or if the probabilistic transform
# was skipped, the data will still be numpy arrays.
# This check ensures they are always converted to tensors.
if not isinstance(image, torch.Tensor):
# Apply ImageNet normalization and convert to a tensor
image = (image - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225]) # Standardize to ImageNet
image = torch.from_numpy(image.transpose(2, 0, 1)).float() # Convert to [C, H, W]
if not isinstance(mask, torch.Tensor):
mask = torch.from_numpy(mask).float()
# Ensure mask has a channel dimension
if mask.ndim == 2:
mask = mask.unsqueeze(0)
return image, mask
Testing Whether the Dataloader Works Correctly¶
# Test to see if the dataset works
from torch.utils.data import DataLoader
dataset = GF7Dataset(
image_dir = "data/GF-7 Building (3Bands)/Train/image",
mask_dir = "data/GF-7 Building (3Bands)/Train/label",
image_size=224
)
loader = DataLoader(dataset, batch_size=8, shuffle=True)
# Iterate over a batch
for images, masks in loader:
print(images.shape) # [B, 3, 224, 224]
print(masks.shape) # [B, 1, 224, 224]
break
print(f"Number of samples in dataset: {len(dataset)}")
torch.Size([8, 3, 224, 224])
torch.Size([8, 1, 224, 224])
Number of samples in dataset: 3106
Transformations¶
Heavy Transformation Pipeline¶
prob_pipeline = 1
transform_pipeline = A.Compose([
A.RandomRotate90(p=0.5),
A.HorizontalFlip(p=0.5),
A.VerticalFlip(p=0.5),
A.SomeOf([
A.RandomBrightnessContrast(brightness_limit=0.1, contrast_limit=0.1, p=1.0),
A.RGBShift(r_shift_limit=8, g_shift_limit=8, b_shift_limit=8, p=1.0),
A.HueSaturationValue(hue_shift_limit=5, sat_shift_limit=8, val_shift_limit=5, p=1.0),
A.RandomGamma(gamma_limit=(90, 110), p=1.0),
], n=2, p=0.8), # Change to 1
# # Cloud Cover Simulation
# A.SomeOf([
# A.RandomFog(fog_coef_range=(0.1, 0.2), alpha_coef=0.08, p=1.0),
# A.RandomShadow(p=1.0),
# ], n=1, p=0.5),
# Noise
A.OneOf([
A.MultiplicativeNoise(multiplier=(0.7, 1.2), per_channel=True,elementwise=True, p=1.0),
# A.GaussNoise(p=1.0),  # removed: added too much noise
], p=0.4),
#A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), # Standardize to ImageNet
#ToTensorV2(),
], p=prob_pipeline, seed=42)
Lite Transformation Pipeline¶
prob_pipeline = 1
lite_transform_pipeline = A.Compose([
# Light spatial transforms
A.HorizontalFlip(p=0.5),
A.VerticalFlip(p=0.5),
A.RandomRotate90(p=0.5), # Lower rotation probability
# One mild color transform per sample
A.OneOf([
A.RandomBrightnessContrast(brightness_limit=0.05, contrast_limit=0.05, p=1.0),
A.HueSaturationValue(hue_shift_limit=3, sat_shift_limit=5, val_shift_limit=3, p=1.0),
A.RGBShift(r_shift_limit=3, g_shift_limit=3, b_shift_limit=3, p=1.0),
A.RandomGamma(gamma_limit=(98, 102), p=1.0),
], p=0.3), # Lighter color shift and reduced prob
# Light noise
A.MultiplicativeNoise(multiplier=(0.9, 1.1), per_channel=True, elementwise=False, p=0.2),
# Normalize + tensor
#A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], p=1.0),
#ToTensorV2(p=1.0),
], p=prob_pipeline, seed=42)
XL Lite Transform Pipeline (Currently Used)¶
prob_pipeline = 1
XL_lite_transform_pipeline = A.Compose([
# Light spatial transforms
A.HorizontalFlip(p=0.5),
A.VerticalFlip(p=0.5),
A.RandomRotate90(p=0.5), # Lower rotation probability
# # One mild color transform per sample
# A.OneOf([
# A.RandomBrightnessContrast(brightness_limit=0.05, contrast_limit=0.05, p=1.0),
# A.HueSaturationValue(hue_shift_limit=3, sat_shift_limit=5, val_shift_limit=3, p=1.0),
# A.RGBShift(r_shift_limit=3, g_shift_limit=3, b_shift_limit=3, p=1.0),
# A.RandomGamma(gamma_limit=(98, 102), p=1.0),
# ], p=0.3), # Lighter color shift and reduced prob
# # Light noise
# A.MultiplicativeNoise(multiplier=(0.9, 1.1), per_channel=True, elementwise=False, p=0.2),
# Normalize + tensor
#A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], p=1.0),
#ToTensorV2(p=1.0),
], p=prob_pipeline, seed=42)
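As a quick, hedged sanity check of the pipeline (random arrays stand in for a real tile and mask; this is not part of the training path):

import numpy as np

img = np.random.rand(224, 224, 3).astype(np.float32)  # stand-in image in [0, 1]
msk = (np.random.rand(224, 224) > 0.5).astype(np.float32)  # stand-in binary mask
out = XL_lite_transform_pipeline(image=img, mask=msk)
print(out['image'].shape, out['mask'].shape)  # (224, 224, 3) (224, 224)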
Testing and Checking Augmentations¶
## Only Used to Check Augmentation
# dataset = GF7Dataset(
# image_dir = "data/GF-7 Building (3Bands)/Train/image",
# mask_dir = "data/GF-7 Building (3Bands)/Train/label",
# image_size=224,
# transform=lite_transform_pipeline
# )
# loader = DataLoader(dataset, batch_size=8, shuffle=True)
# # Iterate over a batch
# for images, masks in loader:
# print(images.shape) # [B, 3, 224, 224]
# print(masks.shape) # [B, 1, 224, 224]
# break
# print(f"Number of samples in dataset: {len(dataset)}")
# import matplotlib.pyplot as plt
# import numpy as np
# # Get a batch from your loader
# for images, masks in loader:
# for i in range(min(4, images.shape[0])):
# img = images[i].cpu().numpy()
# mask = masks[i].cpu().numpy().squeeze(0)
# img = np.transpose(img, (1, 2, 0))
# print(f"Original - min: {img.min():.4f}, max: {img.max():.4f}, mean: {img.mean():.4f}")
# # More robust check for normalization
# if abs(img.mean()) > 0.5 or img.min() < -0.1: # Likely normalized
# print("Detected normalized image, unnormalizing...")
# mean = np.array([0.485, 0.456, 0.406])
# std = np.array([0.229, 0.224, 0.225])
# img = (img * std) + mean
# print(f"After unnorm - min: {img.min():.4f}, max: {img.max():.4f}")
# img = np.clip(img, 0, 1)
# plt.figure(figsize=(6, 3))
# plt.subplot(1, 2, 1)
# plt.imshow(img)
# plt.title("Image")
# plt.axis('off')
# plt.subplot(1, 2, 2)
# plt.imshow(mask, cmap='gray')
# plt.title("Mask")
# plt.axis('off')
# plt.show()
# break
ResNet¶
# NOTE: superseded by the get_r50_b16_config() redefined later with a different
# pretrained_path; CONFIGS below is built from that later definition.
def get_r50_b16_config():
"""Returns the ResNet50 + ViT-B/16 configuration."""
config = get_b16_config()
config.patches.grid = (16, 16)
config.resnet = ml_collections.ConfigDict()
config.resnet.num_layers = (3, 4, 9)
config.resnet.width_factor = 1
config.classifier = 'seg'
config.pretrained_path = '../model/ViT-B_16.npz'
config.decoder_channels = (256, 128, 64, 16)
config.skip_channels = [512, 256, 64, 16]
config.n_classes = 2
config.n_skip = 3
config.activation = 'softmax'
return config
from os.path import join as pjoin
from collections import OrderedDict
import torch
import torch.nn as nn
import torch.nn.functional as F
def np2th(weights, conv=False):
"""Possibly convert HWIO to OIHW."""
if conv:
weights = weights.transpose([3, 2, 0, 1])
return torch.from_numpy(weights)
# Standardize the weights before doing convolution.
# weight shape: (output_channel, input_channel, kernel_size[0], kernel_size[1]),
# so mean and variance are computed over each input_channel * kernel_size[0] * kernel_size[1] slice
class StdConv2d(nn.Conv2d):
def forward(self, x):
w = self.weight
v, m = torch.var_mean(w, dim=[1, 2, 3], keepdim=True, unbiased=False)
w = (w - m) / torch.sqrt(v + 1e-5)
return F.conv2d(x, w, self.bias, self.stride, self.padding,
self.dilation, self.groups)
#do convolution using StdConv2d
def conv3x3(cin, cout, stride=1, groups=1, bias=False):
return StdConv2d(cin, cout, kernel_size=3, stride=stride,
padding=1, bias=bias, groups=groups)
def conv1x1(cin, cout, stride=1, bias=False):
return StdConv2d(cin, cout, kernel_size=1, stride=stride,
padding=0, bias=bias)
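A small, hedged sanity check of the weight standardization (not from the original repo; illustrative channel counts): the standardized kernel should have near-zero mean and near-unit variance per output filter.

conv = conv3x3(cin=3, cout=8)
w = conv.weight
v, m = torch.var_mean(w, dim=[1, 2, 3], keepdim=True, unbiased=False)
w_std = (w - m) / torch.sqrt(v + 1e-5)
print(w_std.mean(dim=[1, 2, 3]))  # ~0 for each of the 8 filters
print(w_std.var(dim=[1, 2, 3], unbiased=False))  # ~1 for each filter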
Pre-Activation Bottleneck Block¶
class PreActBottleneck(nn.Module):
"""Pre-activation (v2) bottleneck block.
"""
def __init__(self, cin, cout=None, cmid=None, stride=1):
super().__init__()
cout = cout or cin
cmid = cmid or cout//4
self.gn1 = nn.GroupNorm(32, cmid, eps=1e-6)
self.conv1 = conv1x1(cin, cmid, bias=False)
self.gn2 = nn.GroupNorm(32, cmid, eps=1e-6)
self.conv2 = conv3x3(cmid, cmid, stride, bias=False) # Original code has it on conv1!!
self.gn3 = nn.GroupNorm(32, cout, eps=1e-6)
self.conv3 = conv1x1(cmid, cout, bias=False)
self.relu = nn.ReLU(inplace=True)
if (stride != 1 or cin != cout):
# Projection also with pre-activation according to paper.
self.downsample = conv1x1(cin, cout, stride, bias=False)
self.gn_proj = nn.GroupNorm(cout, cout)
def forward(self, x):
# Residual branch
residual = x
if hasattr(self, 'downsample'):
residual = self.downsample(x)
residual = self.gn_proj(residual)
# Unit's branch
y = self.relu(self.gn1(self.conv1(x)))
y = self.relu(self.gn2(self.conv2(y)))
y = self.gn3(self.conv3(y))
y = self.relu(residual + y)
return y
def load_from(self, weights, n_block, n_unit):
conv1_weight = np2th(weights[pjoin(n_block, n_unit, "conv1/kernel").replace('\\', '/')], conv=True)
conv2_weight = np2th(weights[pjoin(n_block, n_unit, "conv2/kernel").replace('\\', '/')], conv=True)
conv3_weight = np2th(weights[pjoin(n_block, n_unit, "conv3/kernel").replace('\\', '/')], conv=True)
gn1_weight = np2th(weights[pjoin(n_block, n_unit, "gn1/scale").replace('\\', '/')])
gn1_bias = np2th(weights[pjoin(n_block, n_unit, "gn1/bias").replace('\\', '/')])
gn2_weight = np2th(weights[pjoin(n_block, n_unit, "gn2/scale").replace('\\', '/')])
gn2_bias = np2th(weights[pjoin(n_block, n_unit, "gn2/bias").replace('\\', '/')])
gn3_weight = np2th(weights[pjoin(n_block, n_unit, "gn3/scale").replace('\\', '/')])
gn3_bias = np2th(weights[pjoin(n_block, n_unit, "gn3/bias").replace('\\', '/')])
self.conv1.weight.copy_(conv1_weight)
self.conv2.weight.copy_(conv2_weight)
self.conv3.weight.copy_(conv3_weight)
self.gn1.weight.copy_(gn1_weight.view(-1))
self.gn1.bias.copy_(gn1_bias.view(-1))
self.gn2.weight.copy_(gn2_weight.view(-1))
self.gn2.bias.copy_(gn2_bias.view(-1))
self.gn3.weight.copy_(gn3_weight.view(-1))
self.gn3.bias.copy_(gn3_bias.view(-1))
if hasattr(self, 'downsample'):
proj_conv_weight = np2th(weights[pjoin(n_block, n_unit, "conv_proj/kernel").replace('\\', '/')], conv=True)
proj_gn_weight = np2th(weights[pjoin(n_block, n_unit, "gn_proj/scale").replace('\\', '/')])
proj_gn_bias = np2th(weights[pjoin(n_block, n_unit, "gn_proj/bias").replace('\\', '/')])
self.downsample.weight.copy_(proj_conv_weight)
self.gn_proj.weight.copy_(proj_gn_weight.view(-1))
self.gn_proj.bias.copy_(proj_gn_bias.view(-1))
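A hedged shape check with illustrative values (not from the original repo): with stride 2 the projection shortcut kicks in, halving the spatial size while widening the channels.

block = PreActBottleneck(cin=256, cout=512, cmid=128, stride=2)
y = block(torch.randn(1, 256, 56, 56))
print(y.shape)  # torch.Size([1, 512, 28, 28])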
ResNet V2¶
class ResNetV2(nn.Module):
"""Implementation of Pre-activation (v2) ResNet mode."""
def __init__(self, block_units, width_factor):
super().__init__()
width = int(64 * width_factor)
self.width = width
self.root = nn.Sequential(OrderedDict([
('conv', StdConv2d(3, width, kernel_size=7, stride=2, bias=False, padding=3)),
('gn', nn.GroupNorm(32, width, eps=1e-6)),
('relu', nn.ReLU(inplace=True)),
# ('pool', nn.MaxPool2d(kernel_size=3, stride=2, padding=0))
]))
self.body = nn.Sequential(OrderedDict([
('block1', nn.Sequential(OrderedDict(
[('unit1', PreActBottleneck(cin=width, cout=width*4, cmid=width))] +
[(f'unit{i:d}', PreActBottleneck(cin=width*4, cout=width*4, cmid=width)) for i in range(2, block_units[0] + 1)],
))),
('block2', nn.Sequential(OrderedDict(
[('unit1', PreActBottleneck(cin=width*4, cout=width*8, cmid=width*2, stride=2))] +
[(f'unit{i:d}', PreActBottleneck(cin=width*8, cout=width*8, cmid=width*2)) for i in range(2, block_units[1] + 1)],
))),
('block3', nn.Sequential(OrderedDict(
[('unit1', PreActBottleneck(cin=width*8, cout=width*16, cmid=width*4, stride=2))] +
[(f'unit{i:d}', PreActBottleneck(cin=width*16, cout=width*16, cmid=width*4)) for i in range(2, block_units[2] + 1)],
))),
]))
def forward(self, x):
features = []
b, c, in_size, _ = x.size()
x = self.root(x)
features.append(x)
x = nn.MaxPool2d(kernel_size=3, stride=2, padding=0)(x)
for i in range(len(self.body)-1):
# According to the paper, the outputs of the ResNet blocks are
# concatenated with the decoder features, so the height and
# width must match
x = self.body[i](x)
right_size = int(in_size / 4 / (i+1))
if x.size()[2] != right_size:
pad = right_size - x.size()[2]
assert pad < 3 and pad > 0, "x {} should {}".format(x.size(), right_size)
feat = torch.zeros((b, x.size()[1], right_size, right_size), device=x.device)
feat[:, :, 0:x.size()[2], 0:x.size()[3]] = x[:]
else:
feat = x
features.append(feat)
x = self.body[-1](x)
return x, features[::-1]
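A hedged trace of the backbone shapes for a 224×224 input, using the R50 block configuration (3, 4, 9) from the config above; the returned features feed the decoder skip connections, deepest first.

backbone = ResNetV2(block_units=(3, 4, 9), width_factor=1)
out, feats = backbone(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1024, 14, 14]) -- input to the transformer
print([tuple(f.shape) for f in feats])  # [(1, 512, 28, 28), (1, 256, 56, 56), (1, 64, 112, 112)]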
# coding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import copy
import logging
import math
import ml_collections
from os.path import join as pjoin
import torch
import torch.nn as nn
import numpy as np
from torch.nn import CrossEntropyLoss, Dropout, Softmax, Linear, Conv2d, LayerNorm
from torch.nn.modules.utils import _pair
from scipy import ndimage
logger = logging.getLogger(__name__)
def get_b16_config():
"""Returns the ViT-B/16 configuration."""
config = ml_collections.ConfigDict()
config.patches = ml_collections.ConfigDict({'size': (16, 16)})
config.hidden_size = 768
config.transformer = ml_collections.ConfigDict()
config.transformer.mlp_dim = 3072
config.transformer.num_heads = 12
config.transformer.num_layers = 12
config.transformer.attention_dropout_rate = 0.0
config.transformer.dropout_rate = 0.1
config.classifier = 'seg'
config.representation_size = None
config.resnet_pretrained_path = None
config.pretrained_path = '../model/vit_checkpoint/imagenet21k/ViT-B_16.npz'
config.patch_size = 16
config.decoder_channels = (256, 128, 64, 16)
config.n_classes = 2
config.activation = 'softmax'
return config
def get_r50_b16_config():
"""Returns the Resnet50 + ViT-B/16 configuration."""
config = get_b16_config()
config.patches.grid = (16, 16)
config.resnet = ml_collections.ConfigDict()
config.resnet.num_layers = (3, 4, 9)
config.resnet.width_factor = 1
config.classifier = 'seg'
config.pretrained_path = '../model/vit_checkpoint/imagenet21k/R50+ViT-B_16.npz'
config.decoder_channels = (256, 128, 64, 16)
config.skip_channels = [512, 256, 64, 16]
config.n_classes = 2
config.n_skip = 3
config.activation = 'softmax'
return config
CONFIGS = {
'ViT-B_16': get_b16_config(),
'R50-ViT-B_16': get_r50_b16_config(),
}
Vision Transformer (ViT)¶
Embedding Layer¶
class Embeddings(nn.Module):
"""Construct the embeddings from patch, position embeddings.
"""
def __init__(self, config, img_size, in_channels=3):
super(Embeddings, self).__init__()
self.hybrid = None
self.config = config
img_size = _pair(img_size)
#print(config.patches.get("grid"))
#print(img_size)
if config.patches.get("grid") is not None: # ResNet
grid_size = config.patches["grid"]
#print(grid_size)
patch_size = (img_size[0] // 16 // grid_size[0], img_size[1] // 16 // grid_size[1])
patch_size_real = (patch_size[0] * 16, patch_size[1] * 16)
#print(patch_size,patch_size_real)
n_patches = (img_size[0] // patch_size_real[0]) * (img_size[1] // patch_size_real[1])
self.hybrid = True
else:
patch_size = _pair(config.patches["size"])
n_patches = (img_size[0] // patch_size[0]) * (img_size[1] // patch_size[1])
self.hybrid = False
if self.hybrid:
self.hybrid_model = ResNetV2(block_units=config.resnet.num_layers, width_factor=config.resnet.width_factor)
in_channels = self.hybrid_model.width * 16
self.patch_embeddings = Conv2d(in_channels=in_channels,
out_channels=config.hidden_size,
kernel_size=patch_size,
stride=patch_size)
self.position_embeddings = nn.Parameter(torch.zeros(1, n_patches, config.hidden_size))
self.dropout = Dropout(config.transformer["dropout_rate"])
def forward(self, x):
if self.hybrid:
x, features = self.hybrid_model(x)
else:
features = None
x = self.patch_embeddings(x) # (B, hidden, n_patches^(1/2), n_patches^(1/2))
x = x.flatten(2)
x = x.transpose(-1, -2) # (B, n_patches, hidden)
embeddings = x + self.position_embeddings
embeddings = self.dropout(embeddings)
return embeddings, features
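A hedged shape check: with img_size=224 and the 14×14 grid configured later in the notebook, the hybrid embedding yields 196 tokens of width hidden_size=768 (the config is deep-copied so the shared CONFIGS entry is not mutated).

cfg = copy.deepcopy(get_r50_b16_config())
cfg.patches.grid = (14, 14)  # 224 / 16, as set in the training setup below
emb = Embeddings(cfg, img_size=224)
tokens, feats = emb(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768])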
Attention¶
ATTENTION_Q = "MultiHeadDotProductAttention_1/query"
ATTENTION_K = "MultiHeadDotProductAttention_1/key"
ATTENTION_V = "MultiHeadDotProductAttention_1/value"
ATTENTION_OUT = "MultiHeadDotProductAttention_1/out"
FC_0 = "MlpBlock_3/Dense_0"
FC_1 = "MlpBlock_3/Dense_1"
ATTENTION_NORM = "LayerNorm_0"
MLP_NORM = "LayerNorm_2"
def np2th(weights, conv=False):
"""Possibly convert HWIO to OIHW."""
if conv:
weights = weights.transpose([3, 2, 0, 1])
return torch.from_numpy(weights)
def swish(x):
return x * torch.sigmoid(x)
ACT2FN = {"gelu": torch.nn.functional.gelu, "relu": torch.nn.functional.relu, "swish": swish}
class Attention(nn.Module):
def __init__(self, config, vis):
super(Attention, self).__init__()
self.vis = vis
self.num_attention_heads = config.transformer["num_heads"]
self.attention_head_size = int(config.hidden_size / self.num_attention_heads)
self.all_head_size = self.num_attention_heads * self.attention_head_size
self.query = Linear(config.hidden_size, self.all_head_size)
self.key = Linear(config.hidden_size, self.all_head_size)
self.value = Linear(config.hidden_size, self.all_head_size)
self.out = Linear(config.hidden_size, config.hidden_size)
self.attn_dropout = Dropout(config.transformer["attention_dropout_rate"])
self.proj_dropout = Dropout(config.transformer["attention_dropout_rate"])
self.softmax = Softmax(dim=-1)
def transpose_for_scores(self, x):
new_x_shape = x.size()[:-1] + (self.num_attention_heads, self.attention_head_size)
x = x.view(*new_x_shape)
return x.permute(0, 2, 1, 3)
def forward(self, hidden_states):
mixed_query_layer = self.query(hidden_states)
mixed_key_layer = self.key(hidden_states)
mixed_value_layer = self.value(hidden_states)
query_layer = self.transpose_for_scores(mixed_query_layer)
key_layer = self.transpose_for_scores(mixed_key_layer)
value_layer = self.transpose_for_scores(mixed_value_layer)
attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
attention_scores = attention_scores / math.sqrt(self.attention_head_size)
attention_probs = self.softmax(attention_scores)
weights = attention_probs if self.vis else None
attention_probs = self.attn_dropout(attention_probs)
context_layer = torch.matmul(attention_probs, value_layer)
context_layer = context_layer.permute(0, 2, 1, 3).contiguous()
new_context_layer_shape = context_layer.size()[:-2] + (self.all_head_size,)
context_layer = context_layer.view(*new_context_layer_shape)
attention_output = self.out(context_layer)
attention_output = self.proj_dropout(attention_output)
return attention_output, weights
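A hedged shape sketch (illustrative token count): self-attention preserves the token layout, and with vis=True the per-head attention maps are returned as well.

attn = Attention(get_b16_config(), vis=True)
out, weights = attn(torch.randn(1, 196, 768))
print(out.shape)  # torch.Size([1, 196, 768])
print(weights.shape)  # torch.Size([1, 12, 196, 196]) -- one 196x196 map per head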
Multilayer Perceptron¶
class Mlp(nn.Module):
def __init__(self, config):
super(Mlp, self).__init__()
self.fc1 = Linear(config.hidden_size, config.transformer["mlp_dim"])
self.fc2 = Linear(config.transformer["mlp_dim"], config.hidden_size)
self.act_fn = ACT2FN["gelu"]
self.dropout = Dropout(config.transformer["dropout_rate"])
self._init_weights()
def _init_weights(self):
nn.init.xavier_uniform_(self.fc1.weight)
nn.init.xavier_uniform_(self.fc2.weight)
nn.init.normal_(self.fc1.bias, std=1e-6)
nn.init.normal_(self.fc2.bias, std=1e-6)
def forward(self, x):
x = self.fc1(x)
x = self.act_fn(x)
x = self.dropout(x)
x = self.fc2(x)
x = self.dropout(x)
return x
Transformer block¶
class Block(nn.Module):
def __init__(self, config, vis):
super(Block, self).__init__()
self.hidden_size = config.hidden_size
self.attention_norm = LayerNorm(config.hidden_size, eps=1e-6)
self.ffn_norm = LayerNorm(config.hidden_size, eps=1e-6)
self.ffn = Mlp(config)
self.attn = Attention(config, vis)
def forward(self, x):
h = x
x = self.attention_norm(x)
x, weights = self.attn(x)
x = x + h
h = x
x = self.ffn_norm(x)
x = self.ffn(x)
x = x + h
return x, weights
def load_from(self, weights, n_block):
ROOT = f"Transformer/encoderblock_{n_block}"
with torch.no_grad():
query_weight = np2th(weights[pjoin(ROOT,ATTENTION_Q,"kernel").replace('\\', '/')]).view(self.hidden_size, self.hidden_size).t()
key_weight = np2th(weights[pjoin(ROOT, ATTENTION_K, "kernel").replace('\\', '/')]).view(self.hidden_size, self.hidden_size).t()
value_weight = np2th(weights[pjoin(ROOT, ATTENTION_V, "kernel").replace('\\', '/')]).view(self.hidden_size, self.hidden_size).t()
out_weight = np2th(weights[pjoin(ROOT, ATTENTION_OUT, "kernel").replace('\\', '/')]).view(self.hidden_size, self.hidden_size).t()
query_bias = np2th(weights[pjoin(ROOT, ATTENTION_Q, "bias").replace('\\', '/')]).view(-1)
key_bias = np2th(weights[pjoin(ROOT, ATTENTION_K, "bias").replace('\\', '/')]).view(-1)
value_bias = np2th(weights[pjoin(ROOT, ATTENTION_V, "bias").replace('\\', '/')]).view(-1)
out_bias = np2th(weights[pjoin(ROOT, ATTENTION_OUT, "bias").replace('\\', '/')]).view(-1)
self.attn.query.weight.copy_(query_weight)
self.attn.key.weight.copy_(key_weight)
self.attn.value.weight.copy_(value_weight)
self.attn.out.weight.copy_(out_weight)
self.attn.query.bias.copy_(query_bias)
self.attn.key.bias.copy_(key_bias)
self.attn.value.bias.copy_(value_bias)
self.attn.out.bias.copy_(out_bias)
mlp_weight_0 = np2th(weights[pjoin(ROOT, FC_0, "kernel").replace('\\', '/')]).t()
mlp_weight_1 = np2th(weights[pjoin(ROOT, FC_1, "kernel").replace('\\', '/')]).t()
mlp_bias_0 = np2th(weights[pjoin(ROOT, FC_0, "bias").replace('\\', '/')]).t()
mlp_bias_1 = np2th(weights[pjoin(ROOT, FC_1, "bias").replace('\\', '/')]).t()
self.ffn.fc1.weight.copy_(mlp_weight_0)
self.ffn.fc2.weight.copy_(mlp_weight_1)
self.ffn.fc1.bias.copy_(mlp_bias_0)
self.ffn.fc2.bias.copy_(mlp_bias_1)
self.attention_norm.weight.copy_(np2th(weights[pjoin(ROOT, ATTENTION_NORM, "scale").replace('\\', '/')]))
self.attention_norm.bias.copy_(np2th(weights[pjoin(ROOT, ATTENTION_NORM, "bias").replace('\\', '/')]))
self.ffn_norm.weight.copy_(np2th(weights[pjoin(ROOT, MLP_NORM, "scale").replace('\\', '/')]))
self.ffn_norm.bias.copy_(np2th(weights[pjoin(ROOT, MLP_NORM, "bias").replace('\\', '/')]))
Encoder¶
class Encoder(nn.Module):
def __init__(self, config, vis):
super(Encoder, self).__init__()
self.vis = vis
self.layer = nn.ModuleList()
self.encoder_norm = LayerNorm(config.hidden_size, eps=1e-6)
for _ in range(config.transformer["num_layers"]):
layer = Block(config, vis)
self.layer.append(copy.deepcopy(layer))
def forward(self, hidden_states):
attn_weights = []
for layer_block in self.layer:
hidden_states, weights = layer_block(hidden_states)
if self.vis:
attn_weights.append(weights)
encoded = self.encoder_norm(hidden_states)
return encoded, attn_weights
Transformer¶
class Transformer(nn.Module):
def __init__(self, config, img_size, vis):
super(Transformer, self).__init__()
self.embeddings = Embeddings(config, img_size=img_size)
self.encoder = Encoder(config, vis)
def forward(self, input_ids):
embedding_output, features = self.embeddings(input_ids)
encoded, attn_weights = self.encoder(embedding_output) # (B, n_patch, hidden)
return encoded, attn_weights, features
Decoder¶
class Conv2dReLU(nn.Sequential):
def __init__(
self,
in_channels,
out_channels,
kernel_size,
padding=0,
stride=1,
use_batchnorm=True,
):
conv = nn.Conv2d(
in_channels,
out_channels,
kernel_size,
stride=stride,
padding=padding,
bias=not (use_batchnorm),
)
relu = nn.ReLU(inplace=True)
bn = nn.BatchNorm2d(out_channels)
super(Conv2dReLU, self).__init__(conv, bn, relu)
class DecoderBlock(nn.Module):
def __init__(
self,
in_channels,
out_channels,
skip_channels=0,
use_batchnorm=True,
):
super().__init__()
self.conv1 = Conv2dReLU(
in_channels + skip_channels,
out_channels,
kernel_size=3,
padding=1,
use_batchnorm=use_batchnorm,
)
self.conv2 = Conv2dReLU(
out_channels,
out_channels,
kernel_size=3,
padding=1,
use_batchnorm=use_batchnorm,
)
self.up = nn.UpsamplingBilinear2d(scale_factor=2)
def forward(self, x, skip=None):
x = self.up(x)
if skip is not None:
x = torch.cat([x, skip], dim=1)
x = self.conv1(x)
x = self.conv2(x)
return x
class DecoderCup(nn.Module):
def __init__(self, config):
super().__init__()
self.config = config
head_channels = 512
self.conv_more = Conv2dReLU(
config.hidden_size,
head_channels,
kernel_size=3,
padding=1,
use_batchnorm=True,
)
decoder_channels = config.decoder_channels
in_channels = [head_channels] + list(decoder_channels[:-1])
out_channels = decoder_channels
if self.config.n_skip != 0:
skip_channels = self.config.skip_channels
for i in range(4-self.config.n_skip): # re-select the skip channels according to n_skip
skip_channels[3-i]=0
else:
skip_channels=[0,0,0,0]
blocks = [
DecoderBlock(in_ch, out_ch, sk_ch) for in_ch, out_ch, sk_ch in zip(in_channels, out_channels, skip_channels)
]
self.blocks = nn.ModuleList(blocks)
def forward(self, hidden_states, features=None):
B, n_patch, hidden = hidden_states.size() # reshape from (B, n_patch, hidden) to (B, h, w, hidden)
h, w = int(np.sqrt(n_patch)), int(np.sqrt(n_patch))
x = hidden_states.permute(0, 2, 1)
x = x.contiguous().view(B, hidden, h, w)
x = self.conv_more(x)
for i, decoder_block in enumerate(self.blocks):
if features is not None:
skip = features[i] if (i < self.config.n_skip) else None
else:
skip = None
x = decoder_block(x, skip=skip)
return x
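A hedged decoder sketch with dummy inputs shaped like the encoder outputs traced above (the config is deep-copied, since DecoderCup mutates skip_channels in place):

cfg = copy.deepcopy(CONFIGS['R50-ViT-B_16'])
decoder = DecoderCup(cfg)
hidden = torch.randn(1, 196, 768)  # transformer tokens
feats = [torch.randn(1, 512, 28, 28), torch.randn(1, 256, 56, 56), torch.randn(1, 64, 112, 112)]
print(decoder(hidden, feats).shape)  # torch.Size([1, 16, 224, 224])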
Segmentation Head¶
class SegmentationHead(nn.Sequential):
def __init__(self, in_channels, out_channels, kernel_size=3, upsampling=1):
conv2d = nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, padding=kernel_size // 2)
upsampling = nn.UpsamplingBilinear2d(scale_factor=upsampling) if upsampling > 1 else nn.Identity()
super().__init__(conv2d, upsampling)
Vision Transformer Class¶
class VisionTransformer(nn.Module):
def __init__(self, config, img_size=224, num_classes=21843, zero_head=False, vis=False):
super(VisionTransformer, self).__init__()
self.num_classes = num_classes
self.zero_head = zero_head
self.classifier = config.classifier
self.transformer = Transformer(config, img_size, vis)
self.decoder = DecoderCup(config)
self.segmentation_head = SegmentationHead(
in_channels=config['decoder_channels'][-1],
out_channels=config['n_classes'],
kernel_size=3,
)
self.config = config
def forward(self, x):
if x.size()[1] == 1:
x = x.repeat(1,3,1,1)
x, attn_weights, features = self.transformer(x) # (B, n_patch, hidden)
x = self.decoder(x, features)
logits = self.segmentation_head(x)
return logits
def load_from(self, weights):
with torch.no_grad():
res_weight = weights
self.transformer.embeddings.patch_embeddings.weight.copy_(np2th(weights["embedding/kernel"], conv=True))
self.transformer.embeddings.patch_embeddings.bias.copy_(np2th(weights["embedding/bias"]))
self.transformer.encoder.encoder_norm.weight.copy_(np2th(weights["Transformer/encoder_norm/scale"]))
self.transformer.encoder.encoder_norm.bias.copy_(np2th(weights["Transformer/encoder_norm/bias"]))
posemb = np2th(weights["Transformer/posembed_input/pos_embedding"])
posemb_new = self.transformer.embeddings.position_embeddings
if posemb.size() == posemb_new.size():
self.transformer.embeddings.position_embeddings.copy_(posemb)
elif posemb.size()[1]-1 == posemb_new.size()[1]:
posemb = posemb[:, 1:]
self.transformer.embeddings.position_embeddings.copy_(posemb)
else:
logger.info("load_pretrained: resized variant: %s to %s" % (posemb.size(), posemb_new.size()))
ntok_new = posemb_new.size(1)
if self.classifier == "seg":
_, posemb_grid = posemb[:, :1], posemb[0, 1:]
gs_old = int(np.sqrt(len(posemb_grid)))
gs_new = int(np.sqrt(ntok_new))
print('load_pretrained: grid-size from %s to %s' % (gs_old, gs_new))
posemb_grid = posemb_grid.reshape(gs_old, gs_old, -1)
zoom = (gs_new / gs_old, gs_new / gs_old, 1)
posemb_grid = ndimage.zoom(posemb_grid, zoom, order=1) # th2np
posemb_grid = posemb_grid.reshape(1, gs_new * gs_new, -1)
posemb = posemb_grid
self.transformer.embeddings.position_embeddings.copy_(np2th(posemb))
# Encoder whole
for bname, block in self.transformer.encoder.named_children():
for uname, unit in block.named_children():
unit.load_from(weights, n_block=uname)
if self.transformer.embeddings.hybrid:
self.transformer.embeddings.hybrid_model.root.conv.weight.copy_(np2th(res_weight["conv_root/kernel"], conv=True))
gn_weight = np2th(res_weight["gn_root/scale"]).view(-1)
gn_bias = np2th(res_weight["gn_root/bias"]).view(-1)
self.transformer.embeddings.hybrid_model.root.gn.weight.copy_(gn_weight)
self.transformer.embeddings.hybrid_model.root.gn.bias.copy_(gn_bias)
for bname, block in self.transformer.embeddings.hybrid_model.body.named_children():
for uname, unit in block.named_children():
unit.load_from(res_weight, n_block=bname, n_unit=uname)
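A hedged end-to-end check on CPU with random weights (no pretrained checkpoint loaded); the patch grid must be set to 14×14 for a 224 input, as done in the training setup below.

cfg = copy.deepcopy(CONFIGS['R50-ViT-B_16'])
cfg.patches.grid = (14, 14)
model = VisionTransformer(cfg, img_size=224, num_classes=cfg.n_classes)
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 2, 224, 224]) -- per-pixel logits for 2 classes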
Loss Function¶
class DiceLoss(nn.Module):
def __init__(self, n_classes):
super(DiceLoss, self).__init__()
self.n_classes = n_classes
def _one_hot_encoder(self, input_tensor):
tensor_list = []
for i in range(self.n_classes):
temp_prob = input_tensor == i # * torch.ones_like(input_tensor)
tensor_list.append(temp_prob.unsqueeze(1))
output_tensor = torch.cat(tensor_list, dim=1)
return output_tensor.float()
def _dice_loss(self, score, target):
target = target.float()
smooth = 1e-5
intersect = torch.sum(score * target)
y_sum = torch.sum(target * target)
z_sum = torch.sum(score * score)
loss = (2 * intersect + smooth) / (z_sum + y_sum + smooth)
loss = 1 - loss
return loss
def forward(self, inputs, target, weight=None, softmax=False):
if softmax:
inputs = torch.softmax(inputs, dim=1)
target = self._one_hot_encoder(target)
if weight is None:
weight = [1] * self.n_classes
assert inputs.size() == target.size(), 'predict {} & target {} shape do not match'.format(inputs.size(), target.size())
class_wise_dice = []
loss = 0.0
for i in range(0, self.n_classes):
dice = self._dice_loss(inputs[:, i], target[:, i])
class_wise_dice.append(1.0 - dice.item())
loss += dice * weight[i]
return loss / self.n_classes
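A hedged usage sketch: the loss takes raw logits plus integer labels and, with softmax=True, returns a scalar in [0, 1] (around 0.5 for random two-class predictions).

criterion = DiceLoss(n_classes=2)
logits = torch.randn(4, 2, 224, 224)  # raw model outputs
labels = torch.randint(0, 2, (4, 224, 224))  # per-pixel class indices
print(criterion(logits, labels, softmax=True).item())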
Training¶
# Libraries used for training
import argparse
import logging
import os
import random
import sys
import time
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from tensorboardX import SummaryWriter
from torch.nn.modules.loss import CrossEntropyLoss
from torch.utils.data import DataLoader
import torch.backends.cudnn as cudnn
from torchvision import transforms
from tqdm.notebook import tqdm
Trainer¶
def trainer_synapse(args, model, snapshot_path):
logging.basicConfig(filename=snapshot_path + "/log.txt", level=logging.INFO,
format='[%(asctime)s.%(msecs)03d] %(message)s', datefmt='%H:%M:%S')
logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))
logging.info(str(args))
base_lr = args.base_lr
num_classes = args.num_classes
batch_size = args.batch_size * args.n_gpu
# Use GF7Dataset
db_train = GF7Dataset(
image_dir=args.image_dir,
mask_dir=args.mask_dir,
image_size=args.img_size,
transform=XL_lite_transform_pipeline  # XL lite augmentation pipeline (currently used; see above)
)
print("The length of train set is: {}".format(len(db_train)))
def worker_init_fn(worker_id):
random.seed(args.seed + worker_id)
trainloader = DataLoader(
db_train,
batch_size=batch_size,
shuffle=True,
num_workers=0,
pin_memory=True,
worker_init_fn=worker_init_fn
)
if args.n_gpu > 1:
model = nn.DataParallel(model)
model.train()
ce_loss = CrossEntropyLoss()
dice_loss = DiceLoss(num_classes)
optimizer = optim.AdamW(model.parameters(), lr=base_lr, weight_decay=0.0001, amsgrad=True) # AdamW with AMSGrad (switching from SGD to Adam cut training time by almost half)
max_epoch = args.max_epochs
max_iterations = max_epoch * len(trainloader)
# Num restarts over full training
num_restarts = 3
# Compute adaptive T_0 based on total iterations
T_0 = max_iterations // num_restarts
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
optimizer,
T_0=T_0,
T_mult=1, # keep restarts evenly spaced
eta_min=1e-6 # stable minimum learning rate for Adam
)
writer = SummaryWriter(snapshot_path + '/log')
iter_num = 0
logging.info("{} iterations per epoch. {} max iterations ".format(len(trainloader), max_iterations))
start_time = time.time()
total_loss_sum = 0.0
total_loss_count = 0
conditional_saves_count = 0
max_conditional_saves = 5
iterator = tqdm(range(max_epoch), ncols=500, desc="Epoch", leave=False) # leave=False to avoid cluttering the output
for epoch_num in iterator:
for i_batch, (image_batch, label_batch) in enumerate(trainloader): # Tuple unpacking
image_batch, label_batch = image_batch.cuda(), label_batch.cuda()
outputs = model(image_batch)
loss_ce = ce_loss(outputs, label_batch.long().squeeze(1)) # squeeze if shape is [B, 1, H, W]
loss_dice = dice_loss(outputs, label_batch.squeeze(1), softmax=True)
loss = 0.5 * loss_ce + 0.5 * loss_dice
if iter_num > max_iterations - 10:
total_loss_sum += loss.item()
total_loss_count += 1
optimizer.zero_grad()
loss.backward()
optimizer.step()
# Step the scheduler (use epoch + batch/len(trainloader) for smooth schedule)
scheduler.step(epoch_num + i_batch / len(trainloader))
# Retired: polynomial learning-rate decay, a leftover from the original SGD code.
# It still works with AdamW, but was replaced by the cosine schedule above.
# lr_ = base_lr * (1.0 - iter_num / max_iterations) ** 0.9
# for param_group in optimizer.param_groups:
#     param_group['lr'] = lr_
iter_num += 1
current_lr = optimizer.param_groups[0]['lr']
writer.add_scalar('info/lr', current_lr, iter_num)
writer.add_scalar('info/total_loss', loss, iter_num)
writer.add_scalar('info/loss_ce', loss_ce, iter_num)
#logging.info('iteration %d : loss : %f, loss_ce: %f' % (iter_num, loss.item(), loss_ce.item()))
# Optionally, only log every N iterations
if iter_num % 10 == 0:
logging.info('iteration %d : loss : %f, loss_ce: %f' % (iter_num, loss.item(), loss_ce.item()))
if iter_num > max_iterations - 10:
logging.info('iteration %d : loss : %f, loss_ce: %f' % (iter_num, loss.item(), loss_ce.item()))
if iter_num % 20 == 0:
image = image_batch[1, 0:1, :, :]
image = (image - image.min()) / (image.max() - image.min())
writer.add_image('train/Image', image, iter_num)
outputs_vis = torch.argmax(torch.softmax(outputs, dim=1), dim=1, keepdim=True)
writer.add_image('train/Prediction', outputs_vis[1, ...] * 50, iter_num)
labs = label_batch[1, ...] * 50 # Remove .unsqueeze(0)
writer.add_image('train/GroundTruth', labs, iter_num)
save_interval = 50
if epoch_num > int(max_epoch / 2) and (epoch_num + 1) % save_interval == 0:
save_mode_path = os.path.join(snapshot_path, f'epoch_{epoch_num}.pth')
torch.save(model.state_dict(), save_mode_path)
logging.info(f"save model to {save_mode_path}")
save_interval_2 = 5 # change back to 10 later
if epoch_num > 48 and (epoch_num + 1) % save_interval_2 == 0:
save_mode_path = os.path.join(snapshot_path, f'epoch_{epoch_num}_iter_{iter_num}.pth')
torch.save(model.state_dict(), save_mode_path)
logging.info(f"save model to {save_mode_path}")
# If CE Loss Less Than 0.06 Save Model (Limited to 5 saves)
try:
if loss_ce.item() < 0.06 and conditional_saves_count < max_conditional_saves:
save_mode_path = os.path.join(snapshot_path, f'LOW_CE_epoch_{epoch_num}_iter_{iter_num}_loss_{loss_ce.item():.4f}.pth')
torch.save(model.state_dict(), save_mode_path)
logging.info(f"save model to {save_mode_path} with loss {loss_ce.item():.4f}")
conditional_saves_count += 1
logging.info(f"Conditional saves: {conditional_saves_count}/{max_conditional_saves}")
except Exception as e:
logging.warning(f"Failed to save model at epoch {epoch_num}, iter {iter_num}: {e}")
# Continue training without interruption
if epoch_num >= max_epoch - 1:
save_mode_path = os.path.join(snapshot_path, f'epoch_{epoch_num}.pth')
torch.save(model.state_dict(), save_mode_path)
logging.info(f"save model to {save_mode_path}")
iterator.close()
break
writer.close()
# Calculate and print total time and average seconds per iteration
total_time = time.time() - start_time
avg_time_per_iter = total_time / iter_num if iter_num > 0 else 0
avg_loss = total_loss_sum / total_loss_count if total_loss_count > 0 else 0
print("------Training Stats------")
print(f"Training finished in {total_time:.2f} seconds ({total_time/60:.2f} minutes).")
print(f"Average time per iteration: {avg_time_per_iter:.2f}s/it")
print(f"Average loss: {avg_loss:.4f}")
return "Training Finished!"
Training Arguments¶
import argparse
parser = argparse.ArgumentParser()
# Original args (unchanged)
parser.add_argument('--dataset', type=str, default='GF7')
parser.add_argument('--num_classes', type=int, default=2)
parser.add_argument('--max_iterations', type=int, default=30000)
parser.add_argument('--max_epochs', type=int, default=8)
parser.add_argument('--batch_size', type=int, default=8)
parser.add_argument('--n_gpu', type=int, default=1)
parser.add_argument('--deterministic', type=int, default=1) # Make it 1 for reproducibility
parser.add_argument('--base_lr', type=float, default=0.001)
parser.add_argument('--img_size', type=int, default=224)
parser.add_argument('--seed', type=int, default=42)
parser.add_argument('--n_skip', type=int, default=3)
parser.add_argument('--vit_name', type=str, default='R50-ViT-B_16')
parser.add_argument('--vit_patches_size', type=int, default=16)
# Add these two for GF7Dataset
parser.add_argument('--image_dir', type=str, help='Path to satellite images')
parser.add_argument('--mask_dir', type=str, help='Path to segmentation masks')
# set Epochs common variable and convert to string
epc = '163'
# Parse args manually for notebook
args = parser.parse_args(args=[
'--dataset', 'GF7',
'--num_classes', '2',
'--max_epochs', epc, # roughly 20 epochs ~ 30 min, 50 epochs ~ 1 hour on this GPU
'--batch_size', '25', # optimized for my machine; changed from 20 to 25
'--n_gpu', '1',
'--base_lr', '0.001', # base learning rate (0.0001 was also tried for stability)
'--img_size', '224',
'--seed', '42',
'--n_skip', '3',
'--vit_name', 'R50-ViT-B_16',
'--vit_patches_size', '16',
'--image_dir', 'data/GF-7 Building (3Bands)/Train/image', # Change this Back
'--mask_dir', 'data/GF-7 Building (3Bands)/Train/label' # Change This Back
])
print(args)
Namespace(dataset='GF7', num_classes=2, max_iterations=30000, max_epochs=163, batch_size=25, n_gpu=1, deterministic=1, base_lr=0.001, img_size=224, seed=42, n_skip=3, vit_name='R50-ViT-B_16', vit_patches_size=16, image_dir='data/GF-7 Building (3Bands)/Train/image', mask_dir='data/GF-7 Building (3Bands)/Train/label')
# -----------------------
# Environment Setup
# -----------------------
import os
import random
import numpy as np
import torch
from torch.backends import cudnn
if not args.deterministic:
cudnn.benchmark = True
cudnn.deterministic = False
else:
cudnn.benchmark = False
cudnn.deterministic = True
random.seed(args.seed)
np.random.seed(args.seed)
torch.manual_seed(args.seed)
torch.cuda.manual_seed(args.seed)
# -----------------------
# Dataset Configuration
# -----------------------
dataset_name = 'GF7'
dataset_config = {
'GF7': {
'image_dir': args.image_dir,
'mask_dir': args.mask_dir,
'num_classes': 2
}
}
# Linear LR scaling relative to the reference batch size of 24
if args.batch_size != 24 and args.batch_size % 6 == 0:
args.base_lr *= args.batch_size / 24
args.dataset = dataset_name
args.num_classes = dataset_config[dataset_name]['num_classes']
args.image_dir = dataset_config[dataset_name]['image_dir']
args.mask_dir = dataset_config[dataset_name]['mask_dir']
args.is_pretrain = True
# -----------------------
# Snapshot Path
# -----------------------
args.exp = f'TU_{dataset_name}{args.img_size}'
snapshot_path = f"model/{args.exp}/TU"
snapshot_path += '_pretrain' if args.is_pretrain else ''
snapshot_path += f"_{args.vit_name}_skip{args.n_skip}"
if args.vit_patches_size != 16:
snapshot_path += f"_vitpatch{args.vit_patches_size}"
if args.max_iterations != 30000:
snapshot_path += f"_{str(args.max_iterations)[:2]}k"
if args.max_epochs != 30:
snapshot_path += f"_epo{args.max_epochs}"
snapshot_path += f"_bs{args.batch_size}"
if args.base_lr != 0.01:
snapshot_path += f"_lr{args.base_lr}"
snapshot_path += f"_{args.img_size}"
if args.seed != 1234:
snapshot_path += f"_s{args.seed}"
# Create snapshot directory
if not os.path.exists(snapshot_path):
os.makedirs(snapshot_path)
# -----------------------
# ViT Config and Model
# -----------------------
#Assumes CONFIGS and VisionTransformer were already defined in earlier cells
config_vit = CONFIGS[args.vit_name]
config_vit.n_classes = args.num_classes
config_vit.n_skip = args.n_skip
config_vit.patches.size = (args.vit_patches_size, args.vit_patches_size)
if 'R50' in args.vit_name:
grid_size = int(args.img_size / args.vit_patches_size)
config_vit.patches.grid = (grid_size, grid_size)
# Build model
net = VisionTransformer(config_vit, img_size=args.img_size, num_classes=config_vit.n_classes).cuda()
# -----------------------
# Call Trainer
# -----------------------
to_train = 1
if to_train == 1:
trainer = {'GF7': trainer_synapse}
trainer[dataset_name](args, net, snapshot_path)
else:
print('Training disabled in this notebook. Set to_train = 1 to train the model.')
Namespace(dataset='GF7', num_classes=2, max_iterations=30000, max_epochs=163, batch_size=25, n_gpu=1, deterministic=1, base_lr=0.001, img_size=224, seed=42, n_skip=3, vit_name='R50-ViT-B_16', vit_patches_size=16, image_dir='data/GF-7 Building (3Bands)/Train/image', mask_dir='data/GF-7 Building (3Bands)/Train/label', is_pretrain=True, exp='TU_GF7224')
The length of train set is: 3106
125 iterations per epoch. 20375 max iterations
Epoch: 0%| …
iteration 10 : loss : 0.336394, loss_ce: 0.381616
iteration 20 : loss : 0.377212, loss_ce: 0.453101
iteration 30 : loss : 0.313751, loss_ce: 0.361445
...
iteration 3420 : loss : 0.117686, loss_ce: 0.138058
iteration 3430 : loss : 0.152625, loss_ce: 0.191958
[training log abridged: loss declines from ~0.34 to ~0.12-0.15 over the first 3,400 iterations; log truncated]
loss_ce: 0.147168 iteration 3450 : loss : 0.129081, loss_ce: 0.152879 iteration 3460 : loss : 0.141535, loss_ce: 0.156027 iteration 3470 : loss : 0.132588, loss_ce: 0.160829 iteration 3480 : loss : 0.137007, loss_ce: 0.171075 iteration 3490 : loss : 0.151011, loss_ce: 0.191961 iteration 3500 : loss : 0.113651, loss_ce: 0.112321 iteration 3510 : loss : 0.142829, loss_ce: 0.158991 iteration 3520 : loss : 0.156333, loss_ce: 0.201156 iteration 3530 : loss : 0.132488, loss_ce: 0.164568 iteration 3540 : loss : 0.131296, loss_ce: 0.161802 iteration 3550 : loss : 0.148189, loss_ce: 0.190063 iteration 3560 : loss : 0.110935, loss_ce: 0.130662 iteration 3570 : loss : 0.110816, loss_ce: 0.141116 iteration 3580 : loss : 0.122273, loss_ce: 0.158248 iteration 3590 : loss : 0.134085, loss_ce: 0.161306 iteration 3600 : loss : 0.113243, loss_ce: 0.134094 iteration 3610 : loss : 0.122231, loss_ce: 0.138563 iteration 3620 : loss : 0.153460, loss_ce: 0.199976 iteration 3630 : loss : 0.142472, loss_ce: 0.184160 iteration 3640 : loss : 0.135336, loss_ce: 0.150750 iteration 3650 : loss : 0.122537, loss_ce: 0.152237 iteration 3660 : loss : 0.156903, loss_ce: 0.180064 iteration 3670 : loss : 0.135537, loss_ce: 0.160238 iteration 3680 : loss : 0.127540, loss_ce: 0.148726 iteration 3690 : loss : 0.105657, loss_ce: 0.122841 iteration 3700 : loss : 0.151236, loss_ce: 0.194750 iteration 3710 : loss : 0.137511, loss_ce: 0.177146 iteration 3720 : loss : 0.139362, loss_ce: 0.166809 iteration 3730 : loss : 0.115611, loss_ce: 0.141552 iteration 3740 : loss : 0.117165, loss_ce: 0.139908 iteration 3750 : loss : 0.118369, loss_ce: 0.148879 iteration 3760 : loss : 0.127699, loss_ce: 0.161241 iteration 3770 : loss : 0.127217, loss_ce: 0.151117 iteration 3780 : loss : 0.133513, loss_ce: 0.171123 iteration 3790 : loss : 0.118957, loss_ce: 0.137007 iteration 3800 : loss : 0.117534, loss_ce: 0.138586 iteration 3810 : loss : 0.115033, loss_ce: 0.142838 iteration 3820 : loss : 0.140590, loss_ce: 0.169340 iteration 3830 : loss : 0.134565, loss_ce: 0.151488 iteration 3840 : loss : 0.129432, loss_ce: 0.156876 iteration 3850 : loss : 0.126814, loss_ce: 0.152696 iteration 3860 : loss : 0.133126, loss_ce: 0.162847 iteration 3870 : loss : 0.113741, loss_ce: 0.133882 iteration 3880 : loss : 0.134455, loss_ce: 0.141383 iteration 3890 : loss : 0.144357, loss_ce: 0.186364 iteration 3900 : loss : 0.122137, loss_ce: 0.144869 iteration 3910 : loss : 0.167036, loss_ce: 0.210711 iteration 3920 : loss : 0.132062, loss_ce: 0.164024 iteration 3930 : loss : 0.159093, loss_ce: 0.199515 iteration 3940 : loss : 0.121027, loss_ce: 0.134392 iteration 3950 : loss : 0.114049, loss_ce: 0.143787 iteration 3960 : loss : 0.176792, loss_ce: 0.234764 iteration 3970 : loss : 0.120040, loss_ce: 0.136344 iteration 3980 : loss : 0.144031, loss_ce: 0.185956 iteration 3990 : loss : 0.126621, loss_ce: 0.132754 iteration 4000 : loss : 0.143560, loss_ce: 0.195910 iteration 4010 : loss : 0.120002, loss_ce: 0.144525 iteration 4020 : loss : 0.141839, loss_ce: 0.178676 iteration 4030 : loss : 0.155603, loss_ce: 0.207802 iteration 4040 : loss : 0.102861, loss_ce: 0.126258 iteration 4050 : loss : 0.122243, loss_ce: 0.144322 iteration 4060 : loss : 0.091037, loss_ce: 0.103384 iteration 4070 : loss : 0.129153, loss_ce: 0.164820 iteration 4080 : loss : 0.119284, loss_ce: 0.133608 iteration 4090 : loss : 0.129769, loss_ce: 0.140768 iteration 4100 : loss : 0.122566, loss_ce: 0.156385 iteration 4110 : loss : 0.151877, loss_ce: 0.189081 iteration 4120 : loss : 0.117048, loss_ce: 0.142109 
iteration 4130 : loss : 0.110472, loss_ce: 0.138764 iteration 4140 : loss : 0.116738, loss_ce: 0.133504 iteration 4150 : loss : 0.145730, loss_ce: 0.183727 iteration 4160 : loss : 0.118281, loss_ce: 0.145384 iteration 4170 : loss : 0.137890, loss_ce: 0.159307 iteration 4180 : loss : 0.123907, loss_ce: 0.153481 iteration 4190 : loss : 0.127260, loss_ce: 0.164565 iteration 4200 : loss : 0.110732, loss_ce: 0.125762 iteration 4210 : loss : 0.143986, loss_ce: 0.179820 iteration 4220 : loss : 0.125120, loss_ce: 0.148233 iteration 4230 : loss : 0.097114, loss_ce: 0.116594 iteration 4240 : loss : 0.113016, loss_ce: 0.130493 iteration 4250 : loss : 0.098822, loss_ce: 0.127056 iteration 4260 : loss : 0.111727, loss_ce: 0.138495 iteration 4270 : loss : 0.114439, loss_ce: 0.131057 iteration 4280 : loss : 0.144935, loss_ce: 0.191598 iteration 4290 : loss : 0.133130, loss_ce: 0.169959 iteration 4300 : loss : 0.128658, loss_ce: 0.163912 iteration 4310 : loss : 0.110300, loss_ce: 0.134108 iteration 4320 : loss : 0.101082, loss_ce: 0.112039 iteration 4330 : loss : 0.091877, loss_ce: 0.108069 iteration 4340 : loss : 0.123878, loss_ce: 0.155889 iteration 4350 : loss : 0.173797, loss_ce: 0.242269 iteration 4360 : loss : 0.115839, loss_ce: 0.142456 iteration 4370 : loss : 0.144541, loss_ce: 0.185028 iteration 4380 : loss : 0.128266, loss_ce: 0.158897 iteration 4390 : loss : 0.121240, loss_ce: 0.151298 iteration 4400 : loss : 0.139106, loss_ce: 0.178968 iteration 4410 : loss : 0.143477, loss_ce: 0.192823 iteration 4420 : loss : 0.129697, loss_ce: 0.165015 iteration 4430 : loss : 0.129299, loss_ce: 0.150114 iteration 4440 : loss : 0.133932, loss_ce: 0.168165 iteration 4450 : loss : 0.150331, loss_ce: 0.179035 iteration 4460 : loss : 0.134502, loss_ce: 0.168363 iteration 4470 : loss : 0.122922, loss_ce: 0.147724 iteration 4480 : loss : 0.127204, loss_ce: 0.164350 iteration 4490 : loss : 0.101349, loss_ce: 0.115462 iteration 4500 : loss : 0.125536, loss_ce: 0.171405 iteration 4510 : loss : 0.142797, loss_ce: 0.186181 iteration 4520 : loss : 0.166582, loss_ce: 0.212427 iteration 4530 : loss : 0.137479, loss_ce: 0.173866 iteration 4540 : loss : 0.129922, loss_ce: 0.156234 iteration 4550 : loss : 0.135904, loss_ce: 0.153074 iteration 4560 : loss : 0.130035, loss_ce: 0.158643 iteration 4570 : loss : 0.128521, loss_ce: 0.160670 iteration 4580 : loss : 0.138316, loss_ce: 0.170779 iteration 4590 : loss : 0.121758, loss_ce: 0.143205 iteration 4600 : loss : 0.124293, loss_ce: 0.147549 iteration 4610 : loss : 0.130381, loss_ce: 0.160796 iteration 4620 : loss : 0.118347, loss_ce: 0.152102 iteration 4630 : loss : 0.143081, loss_ce: 0.191820 iteration 4640 : loss : 0.127386, loss_ce: 0.163459 iteration 4650 : loss : 0.109801, loss_ce: 0.129891 iteration 4660 : loss : 0.120090, loss_ce: 0.155997 iteration 4670 : loss : 0.147363, loss_ce: 0.182566 iteration 4680 : loss : 0.119668, loss_ce: 0.152482 iteration 4690 : loss : 0.102131, loss_ce: 0.124707 iteration 4700 : loss : 0.122549, loss_ce: 0.151529 iteration 4710 : loss : 0.113740, loss_ce: 0.138969 iteration 4720 : loss : 0.131727, loss_ce: 0.160798 iteration 4730 : loss : 0.119113, loss_ce: 0.143263 iteration 4740 : loss : 0.115843, loss_ce: 0.146097 iteration 4750 : loss : 0.184091, loss_ce: 0.203325 iteration 4760 : loss : 0.109944, loss_ce: 0.131668 iteration 4770 : loss : 0.163340, loss_ce: 0.209889 iteration 4780 : loss : 0.126126, loss_ce: 0.147611 iteration 4790 : loss : 0.117046, loss_ce: 0.148818 iteration 4800 : loss : 0.123652, loss_ce: 0.153907 iteration 4810 : 
loss : 0.123298, loss_ce: 0.146975 iteration 4820 : loss : 0.121187, loss_ce: 0.126883 iteration 4830 : loss : 0.120594, loss_ce: 0.140078 iteration 4840 : loss : 0.125474, loss_ce: 0.159889 iteration 4850 : loss : 0.126095, loss_ce: 0.162708 iteration 4860 : loss : 0.107752, loss_ce: 0.131507 iteration 4870 : loss : 0.117740, loss_ce: 0.146291 iteration 4880 : loss : 0.107939, loss_ce: 0.138984 iteration 4890 : loss : 0.111279, loss_ce: 0.138337 iteration 4900 : loss : 0.105693, loss_ce: 0.124305 iteration 4910 : loss : 0.106075, loss_ce: 0.123986 iteration 4920 : loss : 0.111962, loss_ce: 0.129951 iteration 4930 : loss : 0.106150, loss_ce: 0.133392 iteration 4940 : loss : 0.132852, loss_ce: 0.169257 iteration 4950 : loss : 0.118361, loss_ce: 0.143444 iteration 4960 : loss : 0.110912, loss_ce: 0.128192 iteration 4970 : loss : 0.110691, loss_ce: 0.141839 iteration 4980 : loss : 0.109071, loss_ce: 0.132727 iteration 4990 : loss : 0.117557, loss_ce: 0.143254 iteration 5000 : loss : 0.123542, loss_ce: 0.153021 iteration 5010 : loss : 0.095667, loss_ce: 0.104485 iteration 5020 : loss : 0.120676, loss_ce: 0.139621 iteration 5030 : loss : 0.113831, loss_ce: 0.144049 iteration 5040 : loss : 0.110923, loss_ce: 0.131858 iteration 5050 : loss : 0.121853, loss_ce: 0.156782 iteration 5060 : loss : 0.117998, loss_ce: 0.148413 iteration 5070 : loss : 0.114370, loss_ce: 0.143706 iteration 5080 : loss : 0.123490, loss_ce: 0.143680 iteration 5090 : loss : 0.121321, loss_ce: 0.146236 iteration 5100 : loss : 0.137482, loss_ce: 0.168496 iteration 5110 : loss : 0.114910, loss_ce: 0.143895 iteration 5120 : loss : 0.121721, loss_ce: 0.150157 iteration 5130 : loss : 0.132670, loss_ce: 0.171489 iteration 5140 : loss : 0.105485, loss_ce: 0.124225 iteration 5150 : loss : 0.116671, loss_ce: 0.137609 iteration 5160 : loss : 0.106883, loss_ce: 0.137500 iteration 5170 : loss : 0.107928, loss_ce: 0.128597 iteration 5180 : loss : 0.142686, loss_ce: 0.178644 iteration 5190 : loss : 0.116949, loss_ce: 0.147906 iteration 5200 : loss : 0.118482, loss_ce: 0.152155 iteration 5210 : loss : 0.129186, loss_ce: 0.165796 iteration 5220 : loss : 0.135880, loss_ce: 0.159184 iteration 5230 : loss : 0.120869, loss_ce: 0.150028 iteration 5240 : loss : 0.101240, loss_ce: 0.122978 iteration 5250 : loss : 0.078907, loss_ce: 0.080493 iteration 5260 : loss : 0.123135, loss_ce: 0.151494 iteration 5270 : loss : 0.125019, loss_ce: 0.158179 iteration 5280 : loss : 0.103107, loss_ce: 0.119954 iteration 5290 : loss : 0.117743, loss_ce: 0.148700 iteration 5300 : loss : 0.155124, loss_ce: 0.201566 iteration 5310 : loss : 0.116892, loss_ce: 0.142722 iteration 5320 : loss : 0.120828, loss_ce: 0.138780 iteration 5330 : loss : 0.099739, loss_ce: 0.122534 iteration 5340 : loss : 0.117081, loss_ce: 0.142902 iteration 5350 : loss : 0.107851, loss_ce: 0.133204 iteration 5360 : loss : 0.122144, loss_ce: 0.147259 iteration 5370 : loss : 0.101729, loss_ce: 0.124264 iteration 5380 : loss : 0.108096, loss_ce: 0.126647 iteration 5390 : loss : 0.125839, loss_ce: 0.165462 iteration 5400 : loss : 0.101617, loss_ce: 0.120327 iteration 5410 : loss : 0.114923, loss_ce: 0.139907 iteration 5420 : loss : 0.093157, loss_ce: 0.094876 iteration 5430 : loss : 0.128488, loss_ce: 0.172775 iteration 5440 : loss : 0.073646, loss_ce: 0.085357 iteration 5450 : loss : 0.115801, loss_ce: 0.150583 iteration 5460 : loss : 0.126177, loss_ce: 0.162867 iteration 5470 : loss : 0.156598, loss_ce: 0.204407 iteration 5480 : loss : 0.101732, loss_ce: 0.121851 iteration 5490 : loss : 0.102098, 
loss_ce: 0.124483 iteration 5500 : loss : 0.132767, loss_ce: 0.107060 iteration 5510 : loss : 0.098253, loss_ce: 0.114319 iteration 5520 : loss : 0.109183, loss_ce: 0.134961 iteration 5530 : loss : 0.126892, loss_ce: 0.152996 iteration 5540 : loss : 0.098990, loss_ce: 0.115150 iteration 5550 : loss : 0.101534, loss_ce: 0.113431 iteration 5560 : loss : 0.114644, loss_ce: 0.142925 iteration 5570 : loss : 0.114383, loss_ce: 0.138787 iteration 5580 : loss : 0.104246, loss_ce: 0.126862 iteration 5590 : loss : 0.118061, loss_ce: 0.155779 iteration 5600 : loss : 0.123311, loss_ce: 0.163650 iteration 5610 : loss : 0.102641, loss_ce: 0.131498 iteration 5620 : loss : 0.119503, loss_ce: 0.148475 iteration 5630 : loss : 0.117346, loss_ce: 0.132718 iteration 5640 : loss : 0.115054, loss_ce: 0.145918 iteration 5650 : loss : 0.101578, loss_ce: 0.119622 iteration 5660 : loss : 0.100288, loss_ce: 0.128063 iteration 5670 : loss : 0.117952, loss_ce: 0.144642 iteration 5680 : loss : 0.114499, loss_ce: 0.145402 iteration 5690 : loss : 0.133461, loss_ce: 0.179263 iteration 5700 : loss : 0.121755, loss_ce: 0.141186 iteration 5710 : loss : 0.116914, loss_ce: 0.143054 iteration 5720 : loss : 0.090847, loss_ce: 0.114031 iteration 5730 : loss : 0.125316, loss_ce: 0.158227 iteration 5740 : loss : 0.092816, loss_ce: 0.116250 iteration 5750 : loss : 0.114226, loss_ce: 0.134633 iteration 5760 : loss : 0.090892, loss_ce: 0.107630 iteration 5770 : loss : 0.106424, loss_ce: 0.126621 iteration 5780 : loss : 0.096488, loss_ce: 0.118481 iteration 5790 : loss : 0.092673, loss_ce: 0.102341 iteration 5800 : loss : 0.107170, loss_ce: 0.129360 iteration 5810 : loss : 0.113334, loss_ce: 0.138858 iteration 5820 : loss : 0.119754, loss_ce: 0.142068 iteration 5830 : loss : 0.107074, loss_ce: 0.125927 iteration 5840 : loss : 0.137078, loss_ce: 0.171291 iteration 5850 : loss : 0.114197, loss_ce: 0.153946 iteration 5860 : loss : 0.111141, loss_ce: 0.142323 iteration 5870 : loss : 0.105295, loss_ce: 0.143670 iteration 5880 : loss : 0.100658, loss_ce: 0.122268 iteration 5890 : loss : 0.109694, loss_ce: 0.143002 iteration 5900 : loss : 0.103372, loss_ce: 0.129898 iteration 5910 : loss : 0.158243, loss_ce: 0.208524 iteration 5920 : loss : 0.111946, loss_ce: 0.133743 iteration 5930 : loss : 0.096535, loss_ce: 0.111055 iteration 5940 : loss : 0.103193, loss_ce: 0.123513 iteration 5950 : loss : 0.111230, loss_ce: 0.136320 iteration 5960 : loss : 0.117968, loss_ce: 0.146263 iteration 5970 : loss : 0.114216, loss_ce: 0.140472 iteration 5980 : loss : 0.100333, loss_ce: 0.121627 iteration 5990 : loss : 0.122838, loss_ce: 0.153538 iteration 6000 : loss : 0.114623, loss_ce: 0.155179 iteration 6010 : loss : 0.113457, loss_ce: 0.136613 iteration 6020 : loss : 0.115566, loss_ce: 0.128887 iteration 6030 : loss : 0.115324, loss_ce: 0.137830 iteration 6040 : loss : 0.115866, loss_ce: 0.141779 iteration 6050 : loss : 0.108289, loss_ce: 0.125446 iteration 6060 : loss : 0.124229, loss_ce: 0.157348 iteration 6070 : loss : 0.097802, loss_ce: 0.105824 iteration 6080 : loss : 0.094934, loss_ce: 0.117631 iteration 6090 : loss : 0.098102, loss_ce: 0.125569 iteration 6100 : loss : 0.096059, loss_ce: 0.110896 iteration 6110 : loss : 0.117914, loss_ce: 0.149168 iteration 6120 : loss : 0.103302, loss_ce: 0.128757 iteration 6130 : loss : 0.103017, loss_ce: 0.128961 iteration 6140 : loss : 0.097018, loss_ce: 0.121900 iteration 6150 : loss : 0.116932, loss_ce: 0.147791 iteration 6160 : loss : 0.120819, loss_ce: 0.153340 iteration 6170 : loss : 0.132531, loss_ce: 0.176021 
iteration 6180 : loss : 0.103005, loss_ce: 0.119543 iteration 6190 : loss : 0.100190, loss_ce: 0.122121 iteration 6200 : loss : 0.109029, loss_ce: 0.122575 iteration 6210 : loss : 0.123644, loss_ce: 0.150564 iteration 6220 : loss : 0.102574, loss_ce: 0.124006 iteration 6230 : loss : 0.108485, loss_ce: 0.136350 iteration 6240 : loss : 0.104229, loss_ce: 0.127477 iteration 6250 : loss : 0.181402, loss_ce: 0.243761 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_49_iter_6250.pth iteration 6260 : loss : 0.121899, loss_ce: 0.141202 iteration 6270 : loss : 0.109777, loss_ce: 0.129112 iteration 6280 : loss : 0.105269, loss_ce: 0.129465 iteration 6290 : loss : 0.096844, loss_ce: 0.127559 iteration 6300 : loss : 0.091233, loss_ce: 0.105475 iteration 6310 : loss : 0.098267, loss_ce: 0.117613 iteration 6320 : loss : 0.112006, loss_ce: 0.151657 iteration 6330 : loss : 0.089082, loss_ce: 0.095855 iteration 6340 : loss : 0.122712, loss_ce: 0.154926 iteration 6350 : loss : 0.102726, loss_ce: 0.126547 iteration 6360 : loss : 0.102276, loss_ce: 0.115226 iteration 6370 : loss : 0.098967, loss_ce: 0.122772 iteration 6380 : loss : 0.113151, loss_ce: 0.143865 iteration 6390 : loss : 0.107035, loss_ce: 0.142234 iteration 6400 : loss : 0.121536, loss_ce: 0.156378 iteration 6410 : loss : 0.091311, loss_ce: 0.104370 iteration 6420 : loss : 0.112488, loss_ce: 0.154054 iteration 6430 : loss : 0.100557, loss_ce: 0.118879 iteration 6440 : loss : 0.126136, loss_ce: 0.147122 iteration 6450 : loss : 0.112652, loss_ce: 0.140277 iteration 6460 : loss : 0.098465, loss_ce: 0.111840 iteration 6470 : loss : 0.104193, loss_ce: 0.127842 iteration 6480 : loss : 0.095857, loss_ce: 0.110402 iteration 6490 : loss : 0.111277, loss_ce: 0.138832 iteration 6500 : loss : 0.131065, loss_ce: 0.167776 iteration 6510 : loss : 0.133994, loss_ce: 0.185136 iteration 6520 : loss : 0.092651, loss_ce: 0.103962 iteration 6530 : loss : 0.100070, loss_ce: 0.127644 iteration 6540 : loss : 0.111374, loss_ce: 0.140859 iteration 6550 : loss : 0.128961, loss_ce: 0.167389 iteration 6560 : loss : 0.111599, loss_ce: 0.133198 iteration 6570 : loss : 0.101989, loss_ce: 0.125766 iteration 6580 : loss : 0.116640, loss_ce: 0.143067 iteration 6590 : loss : 0.096685, loss_ce: 0.111073 iteration 6600 : loss : 0.106663, loss_ce: 0.129812 iteration 6610 : loss : 0.118632, loss_ce: 0.149770 iteration 6620 : loss : 0.103968, loss_ce: 0.125205 iteration 6630 : loss : 0.111217, loss_ce: 0.136429 iteration 6640 : loss : 0.096244, loss_ce: 0.117870 iteration 6650 : loss : 0.104962, loss_ce: 0.126086 iteration 6660 : loss : 0.094168, loss_ce: 0.106830 iteration 6670 : loss : 0.101028, loss_ce: 0.113221 iteration 6680 : loss : 0.091290, loss_ce: 0.104332 iteration 6690 : loss : 0.089394, loss_ce: 0.098761 iteration 6700 : loss : 0.097017, loss_ce: 0.113861 iteration 6710 : loss : 0.132045, loss_ce: 0.162465 iteration 6720 : loss : 0.103505, loss_ce: 0.124583 iteration 6730 : loss : 0.130346, loss_ce: 0.166512 iteration 6740 : loss : 0.108160, loss_ce: 0.126331 iteration 6750 : loss : 0.141116, loss_ce: 0.171852 iteration 6760 : loss : 0.101797, loss_ce: 0.122529 iteration 6770 : loss : 0.103538, loss_ce: 0.124701 iteration 6780 : loss : 0.169018, loss_ce: 0.223212 iteration 6790 : loss : 0.168493, loss_ce: 0.221451 iteration 6800 : loss : 0.118151, loss_ce: 0.140979 iteration 6810 : loss : 0.138472, loss_ce: 0.175670 iteration 6820 : loss : 0.120171, loss_ce: 0.156341 iteration 6830 : loss : 0.105443, loss_ce: 0.123632 iteration 
6840 : loss : 0.111671, loss_ce: 0.142504 iteration 6850 : loss : 0.122920, loss_ce: 0.149699 iteration 6860 : loss : 0.097091, loss_ce: 0.118146 iteration 6870 : loss : 0.096673, loss_ce: 0.112888 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_54_iter_6875.pth iteration 6880 : loss : 0.113384, loss_ce: 0.144677 iteration 6890 : loss : 0.116062, loss_ce: 0.151867 iteration 6900 : loss : 0.080560, loss_ce: 0.098387 iteration 6910 : loss : 0.133665, loss_ce: 0.170193 iteration 6920 : loss : 0.107589, loss_ce: 0.139691 iteration 6930 : loss : 0.125074, loss_ce: 0.151551 iteration 6940 : loss : 0.098827, loss_ce: 0.119561 iteration 6950 : loss : 0.079655, loss_ce: 0.092727 iteration 6960 : loss : 0.123114, loss_ce: 0.151649 iteration 6970 : loss : 0.106870, loss_ce: 0.125233 iteration 6980 : loss : 0.114672, loss_ce: 0.135970 iteration 6990 : loss : 0.119660, loss_ce: 0.148649 iteration 7000 : loss : 0.148233, loss_ce: 0.213257 iteration 7010 : loss : 0.149570, loss_ce: 0.207463 iteration 7020 : loss : 0.107711, loss_ce: 0.133094 iteration 7030 : loss : 0.095389, loss_ce: 0.119515 iteration 7040 : loss : 0.110773, loss_ce: 0.136251 iteration 7050 : loss : 0.087094, loss_ce: 0.093609 iteration 7060 : loss : 0.115236, loss_ce: 0.146441 iteration 7070 : loss : 0.112309, loss_ce: 0.142184 iteration 7080 : loss : 0.092524, loss_ce: 0.108701 iteration 7090 : loss : 0.099755, loss_ce: 0.125760 iteration 7100 : loss : 0.104730, loss_ce: 0.135007 iteration 7110 : loss : 0.121342, loss_ce: 0.154488 iteration 7120 : loss : 0.119527, loss_ce: 0.155535 iteration 7130 : loss : 0.106550, loss_ce: 0.142982 iteration 7140 : loss : 0.124931, loss_ce: 0.155434 iteration 7150 : loss : 0.079953, loss_ce: 0.097908 iteration 7160 : loss : 0.090501, loss_ce: 0.101244 iteration 7170 : loss : 0.123605, loss_ce: 0.150957 iteration 7180 : loss : 0.107960, loss_ce: 0.139749 iteration 7190 : loss : 0.104904, loss_ce: 0.134307 iteration 7200 : loss : 0.094439, loss_ce: 0.115771 iteration 7210 : loss : 0.125113, loss_ce: 0.155689 iteration 7220 : loss : 0.101494, loss_ce: 0.115538 iteration 7230 : loss : 0.100545, loss_ce: 0.120914 iteration 7240 : loss : 0.111357, loss_ce: 0.144118 iteration 7250 : loss : 0.101963, loss_ce: 0.126287 iteration 7260 : loss : 0.108586, loss_ce: 0.132402 iteration 7270 : loss : 0.094398, loss_ce: 0.115266 iteration 7280 : loss : 0.088693, loss_ce: 0.115847 iteration 7290 : loss : 0.112070, loss_ce: 0.147122 iteration 7300 : loss : 0.115278, loss_ce: 0.139423 iteration 7310 : loss : 0.075520, loss_ce: 0.088969 iteration 7320 : loss : 0.102663, loss_ce: 0.123870 iteration 7330 : loss : 0.126493, loss_ce: 0.157384 iteration 7340 : loss : 0.116765, loss_ce: 0.161091 iteration 7350 : loss : 0.095397, loss_ce: 0.119536 iteration 7360 : loss : 0.098260, loss_ce: 0.097171 iteration 7370 : loss : 0.122811, loss_ce: 0.157212 iteration 7380 : loss : 0.135894, loss_ce: 0.178298 iteration 7390 : loss : 0.124492, loss_ce: 0.157610 iteration 7400 : loss : 0.104384, loss_ce: 0.125637 iteration 7410 : loss : 0.107911, loss_ce: 0.137419 iteration 7420 : loss : 0.091649, loss_ce: 0.113773 iteration 7430 : loss : 0.092718, loss_ce: 0.116996 iteration 7440 : loss : 0.103905, loss_ce: 0.123902 iteration 7450 : loss : 0.102279, loss_ce: 0.125494 iteration 7460 : loss : 0.087753, loss_ce: 0.109192 iteration 7470 : loss : 0.084189, loss_ce: 0.095699 iteration 7480 : loss : 0.112605, loss_ce: 0.145005 iteration 7490 : loss : 0.111185, loss_ce: 0.141826 iteration 7500 : 
loss : 0.109236, loss_ce: 0.132971 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_59_iter_7500.pth iteration 7510 : loss : 0.094588, loss_ce: 0.116947 iteration 7520 : loss : 0.109587, loss_ce: 0.136489 iteration 7530 : loss : 0.092494, loss_ce: 0.115313 iteration 7540 : loss : 0.110538, loss_ce: 0.146011 iteration 7550 : loss : 0.087469, loss_ce: 0.102880 iteration 7560 : loss : 0.094792, loss_ce: 0.123531 iteration 7570 : loss : 0.090438, loss_ce: 0.108974 iteration 7580 : loss : 0.094824, loss_ce: 0.110255 iteration 7590 : loss : 0.099546, loss_ce: 0.128375 iteration 7600 : loss : 0.099030, loss_ce: 0.122072 iteration 7610 : loss : 0.083518, loss_ce: 0.104195 iteration 7620 : loss : 0.106669, loss_ce: 0.131127 iteration 7630 : loss : 0.109540, loss_ce: 0.129692 iteration 7640 : loss : 0.091337, loss_ce: 0.110181 iteration 7650 : loss : 0.108766, loss_ce: 0.138289 iteration 7660 : loss : 0.123230, loss_ce: 0.158019 iteration 7670 : loss : 0.079997, loss_ce: 0.096661 iteration 7680 : loss : 0.105704, loss_ce: 0.132714 iteration 7690 : loss : 0.083178, loss_ce: 0.103218 iteration 7700 : loss : 0.099347, loss_ce: 0.121944 iteration 7710 : loss : 0.105077, loss_ce: 0.132197 iteration 7720 : loss : 0.096665, loss_ce: 0.096754 iteration 7730 : loss : 0.085659, loss_ce: 0.103650 iteration 7740 : loss : 0.116015, loss_ce: 0.142657 iteration 7750 : loss : 0.172828, loss_ce: 0.225995 iteration 7760 : loss : 0.139269, loss_ce: 0.188169 iteration 7770 : loss : 0.131355, loss_ce: 0.164982 iteration 7780 : loss : 0.094839, loss_ce: 0.117302 iteration 7790 : loss : 0.103572, loss_ce: 0.133076 iteration 7800 : loss : 0.105767, loss_ce: 0.133080 iteration 7810 : loss : 0.086941, loss_ce: 0.112707 iteration 7820 : loss : 0.122630, loss_ce: 0.159203 iteration 7830 : loss : 0.107727, loss_ce: 0.137915 iteration 7840 : loss : 0.116719, loss_ce: 0.150599 iteration 7850 : loss : 0.073005, loss_ce: 0.090394 iteration 7860 : loss : 0.112744, loss_ce: 0.144757 iteration 7870 : loss : 0.094258, loss_ce: 0.104475 iteration 7880 : loss : 0.103541, loss_ce: 0.128465 iteration 7890 : loss : 0.090771, loss_ce: 0.108725 iteration 7900 : loss : 0.119772, loss_ce: 0.154634 iteration 7910 : loss : 0.089007, loss_ce: 0.112470 iteration 7920 : loss : 0.088165, loss_ce: 0.110012 iteration 7930 : loss : 0.139958, loss_ce: 0.185741 iteration 7940 : loss : 0.089105, loss_ce: 0.109800 iteration 7950 : loss : 0.087655, loss_ce: 0.113937 iteration 7960 : loss : 0.087730, loss_ce: 0.110961 iteration 7970 : loss : 0.100344, loss_ce: 0.126301 iteration 7980 : loss : 0.122184, loss_ce: 0.162444 iteration 7990 : loss : 0.100666, loss_ce: 0.123825 iteration 8000 : loss : 0.109982, loss_ce: 0.119499 iteration 8010 : loss : 0.083189, loss_ce: 0.102415 iteration 8020 : loss : 0.114288, loss_ce: 0.141078 iteration 8030 : loss : 0.104659, loss_ce: 0.126169 iteration 8040 : loss : 0.079669, loss_ce: 0.094697 iteration 8050 : loss : 0.092772, loss_ce: 0.119777 iteration 8060 : loss : 0.101634, loss_ce: 0.131218 iteration 8070 : loss : 0.103328, loss_ce: 0.109847 iteration 8080 : loss : 0.092310, loss_ce: 0.115856 iteration 8090 : loss : 0.115102, loss_ce: 0.148404 iteration 8100 : loss : 0.090969, loss_ce: 0.099115 iteration 8110 : loss : 0.103911, loss_ce: 0.134902 iteration 8120 : loss : 0.117180, loss_ce: 0.148632 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_64_iter_8125.pth iteration 8130 : loss : 0.080964, loss_ce: 0.088248 iteration 8140 : 
loss : 0.112762, loss_ce: 0.149018 iteration 8150 : loss : 0.120082, loss_ce: 0.157644 iteration 8160 : loss : 0.105806, loss_ce: 0.134045 iteration 8170 : loss : 0.093422, loss_ce: 0.105539 iteration 8180 : loss : 0.116185, loss_ce: 0.150314 iteration 8190 : loss : 0.095286, loss_ce: 0.118673 iteration 8200 : loss : 0.115211, loss_ce: 0.142501 iteration 8210 : loss : 0.090720, loss_ce: 0.109714 iteration 8220 : loss : 0.112641, loss_ce: 0.144559 iteration 8230 : loss : 0.090179, loss_ce: 0.116658 iteration 8240 : loss : 0.080772, loss_ce: 0.105777 iteration 8250 : loss : 0.119051, loss_ce: 0.152960 iteration 8260 : loss : 0.091240, loss_ce: 0.111370 iteration 8270 : loss : 0.096706, loss_ce: 0.121861 iteration 8280 : loss : 0.081842, loss_ce: 0.099861 iteration 8290 : loss : 0.099625, loss_ce: 0.131209 iteration 8300 : loss : 0.121039, loss_ce: 0.152191 iteration 8310 : loss : 0.106822, loss_ce: 0.129062 iteration 8320 : loss : 0.102092, loss_ce: 0.132168 iteration 8330 : loss : 0.098322, loss_ce: 0.117199 iteration 8340 : loss : 0.093866, loss_ce: 0.114846 iteration 8350 : loss : 0.109146, loss_ce: 0.142240 iteration 8360 : loss : 0.100511, loss_ce: 0.136222 iteration 8370 : loss : 0.096025, loss_ce: 0.124713 iteration 8380 : loss : 0.100826, loss_ce: 0.127972 iteration 8390 : loss : 0.092026, loss_ce: 0.115987 iteration 8400 : loss : 0.087411, loss_ce: 0.102647 iteration 8410 : loss : 0.107849, loss_ce: 0.137716 iteration 8420 : loss : 0.121310, loss_ce: 0.151644 iteration 8430 : loss : 0.105557, loss_ce: 0.137057 iteration 8440 : loss : 0.085206, loss_ce: 0.099626 iteration 8450 : loss : 0.087438, loss_ce: 0.098594 iteration 8460 : loss : 0.093184, loss_ce: 0.116207 iteration 8470 : loss : 0.095280, loss_ce: 0.125887 iteration 8480 : loss : 0.088152, loss_ce: 0.110099 iteration 8490 : loss : 0.083479, loss_ce: 0.105941 iteration 8500 : loss : 0.097054, loss_ce: 0.116925 iteration 8510 : loss : 0.091394, loss_ce: 0.109526 iteration 8520 : loss : 0.075413, loss_ce: 0.088920 iteration 8530 : loss : 0.090703, loss_ce: 0.108343 iteration 8540 : loss : 0.091654, loss_ce: 0.106796 iteration 8550 : loss : 0.117359, loss_ce: 0.144991 iteration 8560 : loss : 0.101421, loss_ce: 0.127814 iteration 8570 : loss : 0.100370, loss_ce: 0.119994 iteration 8580 : loss : 0.117244, loss_ce: 0.153303 iteration 8590 : loss : 0.092978, loss_ce: 0.104221 iteration 8600 : loss : 0.106631, loss_ce: 0.130475 iteration 8610 : loss : 0.082912, loss_ce: 0.107617 iteration 8620 : loss : 0.103673, loss_ce: 0.127131 iteration 8630 : loss : 0.102962, loss_ce: 0.126368 iteration 8640 : loss : 0.100129, loss_ce: 0.121748 iteration 8650 : loss : 0.102273, loss_ce: 0.124021 iteration 8660 : loss : 0.096249, loss_ce: 0.119971 iteration 8670 : loss : 0.119485, loss_ce: 0.163572 iteration 8680 : loss : 0.089718, loss_ce: 0.107478 iteration 8690 : loss : 0.089358, loss_ce: 0.116660 iteration 8700 : loss : 0.100914, loss_ce: 0.122667 iteration 8710 : loss : 0.108735, loss_ce: 0.136429 iteration 8720 : loss : 0.093353, loss_ce: 0.118121 iteration 8730 : loss : 0.087952, loss_ce: 0.099117 iteration 8740 : loss : 0.083498, loss_ce: 0.094831 iteration 8750 : loss : 0.115596, loss_ce: 0.160274 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_69_iter_8750.pth iteration 8760 : loss : 0.113493, loss_ce: 0.136662 iteration 8770 : loss : 0.107495, loss_ce: 0.142274 iteration 8780 : loss : 0.103945, loss_ce: 0.120607 iteration 8790 : loss : 0.096824, loss_ce: 0.115769 iteration 8800 : loss : 
0.089597, loss_ce: 0.111387 iteration 8810 : loss : 0.093859, loss_ce: 0.116652 iteration 8820 : loss : 0.084963, loss_ce: 0.110187 iteration 8830 : loss : 0.100172, loss_ce: 0.130233 iteration 8840 : loss : 0.124687, loss_ce: 0.147957 iteration 8850 : loss : 0.117337, loss_ce: 0.151328 iteration 8860 : loss : 0.081932, loss_ce: 0.096297 iteration 8870 : loss : 0.099927, loss_ce: 0.133484 iteration 8880 : loss : 0.066651, loss_ce: 0.071170 iteration 8890 : loss : 0.099646, loss_ce: 0.122098 iteration 8900 : loss : 0.084914, loss_ce: 0.099420 iteration 8910 : loss : 0.095161, loss_ce: 0.112983 iteration 8920 : loss : 0.112253, loss_ce: 0.147327 iteration 8930 : loss : 0.115041, loss_ce: 0.149789 iteration 8940 : loss : 0.094751, loss_ce: 0.118447 iteration 8950 : loss : 0.092215, loss_ce: 0.111336 iteration 8960 : loss : 0.097922, loss_ce: 0.124098 iteration 8970 : loss : 0.096964, loss_ce: 0.108200 iteration 8980 : loss : 0.092922, loss_ce: 0.115030 iteration 8990 : loss : 0.096163, loss_ce: 0.130193 iteration 9000 : loss : 0.106964, loss_ce: 0.122946 iteration 9010 : loss : 0.091857, loss_ce: 0.109767 iteration 9020 : loss : 0.097149, loss_ce: 0.111609 iteration 9030 : loss : 0.088262, loss_ce: 0.113882 iteration 9040 : loss : 0.091508, loss_ce: 0.116083 iteration 9050 : loss : 0.083786, loss_ce: 0.110882 iteration 9060 : loss : 0.098337, loss_ce: 0.122626 iteration 9070 : loss : 0.081403, loss_ce: 0.096467 iteration 9080 : loss : 0.096616, loss_ce: 0.115759 iteration 9090 : loss : 0.122175, loss_ce: 0.150667 iteration 9100 : loss : 0.122580, loss_ce: 0.157241 iteration 9110 : loss : 0.128639, loss_ce: 0.165801 iteration 9120 : loss : 0.090101, loss_ce: 0.114953 iteration 9130 : loss : 0.096625, loss_ce: 0.122777 iteration 9140 : loss : 0.095314, loss_ce: 0.108226 iteration 9150 : loss : 0.099951, loss_ce: 0.118075 iteration 9160 : loss : 0.095872, loss_ce: 0.123155 iteration 9170 : loss : 0.111912, loss_ce: 0.146943 iteration 9180 : loss : 0.095799, loss_ce: 0.119128 iteration 9190 : loss : 0.099154, loss_ce: 0.118064 iteration 9200 : loss : 0.107732, loss_ce: 0.136699 iteration 9210 : loss : 0.089145, loss_ce: 0.102271 iteration 9220 : loss : 0.107162, loss_ce: 0.144934 iteration 9230 : loss : 0.109776, loss_ce: 0.148663 iteration 9240 : loss : 0.078719, loss_ce: 0.096309 iteration 9250 : loss : 0.113618, loss_ce: 0.118493 iteration 9260 : loss : 0.078539, loss_ce: 0.098756 iteration 9270 : loss : 0.087668, loss_ce: 0.111385 iteration 9280 : loss : 0.092351, loss_ce: 0.112800 iteration 9290 : loss : 0.100473, loss_ce: 0.127318 iteration 9300 : loss : 0.076918, loss_ce: 0.094476 iteration 9310 : loss : 0.103787, loss_ce: 0.126994 iteration 9320 : loss : 0.097488, loss_ce: 0.123969 iteration 9330 : loss : 0.073343, loss_ce: 0.084120 iteration 9340 : loss : 0.098149, loss_ce: 0.130426 iteration 9350 : loss : 0.097694, loss_ce: 0.127120 iteration 9360 : loss : 0.096809, loss_ce: 0.124555 iteration 9370 : loss : 0.090340, loss_ce: 0.114724 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_74_iter_9375.pth iteration 9380 : loss : 0.084535, loss_ce: 0.090348 iteration 9390 : loss : 0.114783, loss_ce: 0.158142 iteration 9400 : loss : 0.078844, loss_ce: 0.095697 iteration 9410 : loss : 0.086576, loss_ce: 0.105024 iteration 9420 : loss : 0.091167, loss_ce: 0.110479 iteration 9430 : loss : 0.096514, loss_ce: 0.114534 iteration 9440 : loss : 0.091468, loss_ce: 0.115046 iteration 9450 : loss : 0.094036, loss_ce: 0.115786 iteration 9460 : loss : 0.079054, 
loss_ce: 0.085809 iteration 9470 : loss : 0.092858, loss_ce: 0.120227 iteration 9480 : loss : 0.091275, loss_ce: 0.113021 iteration 9490 : loss : 0.064572, loss_ce: 0.070435 iteration 9500 : loss : 0.126525, loss_ce: 0.164170 iteration 9510 : loss : 0.096069, loss_ce: 0.119511 iteration 9520 : loss : 0.104209, loss_ce: 0.135216 iteration 9530 : loss : 0.079202, loss_ce: 0.104681 iteration 9540 : loss : 0.094574, loss_ce: 0.124834 iteration 9550 : loss : 0.085874, loss_ce: 0.090396 iteration 9560 : loss : 0.101668, loss_ce: 0.123416 iteration 9570 : loss : 0.108368, loss_ce: 0.146209 iteration 9580 : loss : 0.098523, loss_ce: 0.118973 iteration 9590 : loss : 0.072356, loss_ce: 0.080220 iteration 9600 : loss : 0.099297, loss_ce: 0.120791 iteration 9610 : loss : 0.075802, loss_ce: 0.094382 iteration 9620 : loss : 0.092353, loss_ce: 0.117794 iteration 9630 : loss : 0.106768, loss_ce: 0.132818 iteration 9640 : loss : 0.086995, loss_ce: 0.108681 iteration 9650 : loss : 0.085090, loss_ce: 0.112582 iteration 9660 : loss : 0.096684, loss_ce: 0.117912 iteration 9670 : loss : 0.113080, loss_ce: 0.148509 iteration 9680 : loss : 0.078385, loss_ce: 0.091127 iteration 9690 : loss : 0.085083, loss_ce: 0.103357 iteration 9700 : loss : 0.095384, loss_ce: 0.117185 iteration 9710 : loss : 0.092297, loss_ce: 0.115023 iteration 9720 : loss : 0.079504, loss_ce: 0.098697 iteration 9730 : loss : 0.112331, loss_ce: 0.141908 iteration 9740 : loss : 0.070392, loss_ce: 0.083258 iteration 9750 : loss : 0.090165, loss_ce: 0.123438 iteration 9760 : loss : 0.088817, loss_ce: 0.115106 iteration 9770 : loss : 0.100869, loss_ce: 0.120560 iteration 9780 : loss : 0.101059, loss_ce: 0.122750 iteration 9790 : loss : 0.078741, loss_ce: 0.094333 iteration 9800 : loss : 0.094628, loss_ce: 0.121713 iteration 9810 : loss : 0.091094, loss_ce: 0.117303 iteration 9820 : loss : 0.078178, loss_ce: 0.076456 iteration 9830 : loss : 0.080980, loss_ce: 0.096718 iteration 9840 : loss : 0.095602, loss_ce: 0.126483 iteration 9850 : loss : 0.078666, loss_ce: 0.099363 iteration 9860 : loss : 0.078782, loss_ce: 0.087971 iteration 9870 : loss : 0.102608, loss_ce: 0.135944 iteration 9880 : loss : 0.076092, loss_ce: 0.091654 iteration 9890 : loss : 0.105541, loss_ce: 0.139269 iteration 9900 : loss : 0.093921, loss_ce: 0.117578 iteration 9910 : loss : 0.088829, loss_ce: 0.099221 iteration 9920 : loss : 0.115608, loss_ce: 0.147254 iteration 9930 : loss : 0.077152, loss_ce: 0.087644 iteration 9940 : loss : 0.081661, loss_ce: 0.096972 iteration 9950 : loss : 0.086811, loss_ce: 0.106862 iteration 9960 : loss : 0.120973, loss_ce: 0.163637 iteration 9970 : loss : 0.082027, loss_ce: 0.092487 iteration 9980 : loss : 0.078730, loss_ce: 0.103294 iteration 9990 : loss : 0.074094, loss_ce: 0.084221 iteration 10000 : loss : 0.062876, loss_ce: 0.083464 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_79_iter_10000.pth iteration 10010 : loss : 0.097630, loss_ce: 0.120639 iteration 10020 : loss : 0.098091, loss_ce: 0.124338 iteration 10030 : loss : 0.075574, loss_ce: 0.090510 iteration 10040 : loss : 0.087639, loss_ce: 0.105862 iteration 10050 : loss : 0.083454, loss_ce: 0.100974 iteration 10060 : loss : 0.080815, loss_ce: 0.096471 iteration 10070 : loss : 0.084715, loss_ce: 0.103475 iteration 10080 : loss : 0.078279, loss_ce: 0.079840 iteration 10090 : loss : 0.082460, loss_ce: 0.095212 iteration 10100 : loss : 0.100137, loss_ce: 0.125295 iteration 10110 : loss : 0.097971, loss_ce: 0.124770 iteration 10120 : loss : 
0.117341, loss_ce: 0.159566 iteration 10130 : loss : 0.069816, loss_ce: 0.082422 iteration 10140 : loss : 0.099413, loss_ce: 0.127066 iteration 10150 : loss : 0.086981, loss_ce: 0.100955 iteration 10160 : loss : 0.086797, loss_ce: 0.112914 iteration 10170 : loss : 0.084556, loss_ce: 0.102157 iteration 10180 : loss : 0.082987, loss_ce: 0.104280 iteration 10190 : loss : 0.110369, loss_ce: 0.140087 iteration 10200 : loss : 0.102130, loss_ce: 0.125898 iteration 10210 : loss : 0.096088, loss_ce: 0.112909 iteration 10220 : loss : 0.105262, loss_ce: 0.131818 iteration 10230 : loss : 0.085843, loss_ce: 0.107219 iteration 10240 : loss : 0.082057, loss_ce: 0.097849 iteration 10250 : loss : 0.098409, loss_ce: 0.121208 iteration 10260 : loss : 0.083245, loss_ce: 0.097734 iteration 10270 : loss : 0.094360, loss_ce: 0.121955 iteration 10280 : loss : 0.074336, loss_ce: 0.092272 iteration 10290 : loss : 0.079535, loss_ce: 0.101500 iteration 10300 : loss : 0.089944, loss_ce: 0.105814 iteration 10310 : loss : 0.090200, loss_ce: 0.116979 iteration 10320 : loss : 0.109129, loss_ce: 0.138442 iteration 10330 : loss : 0.095016, loss_ce: 0.114634 iteration 10340 : loss : 0.104925, loss_ce: 0.124371 iteration 10350 : loss : 0.096338, loss_ce: 0.124595 iteration 10360 : loss : 0.095042, loss_ce: 0.123628 iteration 10370 : loss : 0.083595, loss_ce: 0.107781 iteration 10380 : loss : 0.103601, loss_ce: 0.135958 iteration 10390 : loss : 0.083721, loss_ce: 0.108728 iteration 10400 : loss : 0.089104, loss_ce: 0.103338 iteration 10410 : loss : 0.104971, loss_ce: 0.128429 iteration 10420 : loss : 0.087085, loss_ce: 0.104860 iteration 10430 : loss : 0.108133, loss_ce: 0.136051 iteration 10440 : loss : 0.077001, loss_ce: 0.096549 iteration 10450 : loss : 0.080976, loss_ce: 0.089902 iteration 10460 : loss : 0.089394, loss_ce: 0.121811 iteration 10470 : loss : 0.088627, loss_ce: 0.099441 iteration 10480 : loss : 0.080975, loss_ce: 0.089422 iteration 10490 : loss : 0.088444, loss_ce: 0.113619 iteration 10500 : loss : 0.108779, loss_ce: 0.138469 iteration 10510 : loss : 0.091997, loss_ce: 0.111404 iteration 10520 : loss : 0.074586, loss_ce: 0.092828 iteration 10530 : loss : 0.090540, loss_ce: 0.103919 iteration 10540 : loss : 0.097273, loss_ce: 0.124318 iteration 10550 : loss : 0.082164, loss_ce: 0.100188 iteration 10560 : loss : 0.100812, loss_ce: 0.124781 iteration 10570 : loss : 0.096389, loss_ce: 0.115156 iteration 10580 : loss : 0.094088, loss_ce: 0.125667 iteration 10590 : loss : 0.077880, loss_ce: 0.092286 iteration 10600 : loss : 0.100188, loss_ce: 0.135949 iteration 10610 : loss : 0.078762, loss_ce: 0.095084 iteration 10620 : loss : 0.089279, loss_ce: 0.114694 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_84_iter_10625.pth iteration 10630 : loss : 0.090558, loss_ce: 0.118949 iteration 10640 : loss : 0.083269, loss_ce: 0.097521 iteration 10650 : loss : 0.093595, loss_ce: 0.118557 iteration 10660 : loss : 0.100916, loss_ce: 0.126514 iteration 10670 : loss : 0.101990, loss_ce: 0.127288 iteration 10680 : loss : 0.088390, loss_ce: 0.101923 iteration 10690 : loss : 0.093267, loss_ce: 0.117421 iteration 10700 : loss : 0.080359, loss_ce: 0.103764 iteration 10710 : loss : 0.085384, loss_ce: 0.099614 iteration 10720 : loss : 0.111501, loss_ce: 0.151400 iteration 10730 : loss : 0.062878, loss_ce: 0.070411 iteration 10740 : loss : 0.102747, loss_ce: 0.140500 iteration 10750 : loss : 0.079831, loss_ce: 0.092938 iteration 10760 : loss : 0.081986, loss_ce: 0.103160 iteration 10770 : 
loss : 0.075200, loss_ce: 0.092950 iteration 10780 : loss : 0.072546, loss_ce: 0.084887 iteration 10790 : loss : 0.075801, loss_ce: 0.090824 iteration 10800 : loss : 0.083515, loss_ce: 0.099006 iteration 10810 : loss : 0.086557, loss_ce: 0.110942 iteration 10820 : loss : 0.061667, loss_ce: 0.072950 iteration 10830 : loss : 0.100267, loss_ce: 0.129681 iteration 10840 : loss : 0.096652, loss_ce: 0.125838 iteration 10850 : loss : 0.097570, loss_ce: 0.124328 iteration 10860 : loss : 0.059794, loss_ce: 0.067547 iteration 10870 : loss : 0.074062, loss_ce: 0.094706 iteration 10880 : loss : 0.088684, loss_ce: 0.113127 iteration 10890 : loss : 0.088183, loss_ce: 0.101332 iteration 10900 : loss : 0.089660, loss_ce: 0.114461 iteration 10910 : loss : 0.084607, loss_ce: 0.102701 iteration 10920 : loss : 0.079589, loss_ce: 0.096138 iteration 10930 : loss : 0.093619, loss_ce: 0.117950 iteration 10940 : loss : 0.090973, loss_ce: 0.113806 iteration 10950 : loss : 0.080635, loss_ce: 0.102636 iteration 10960 : loss : 0.090684, loss_ce: 0.119464 iteration 10970 : loss : 0.099132, loss_ce: 0.126942 iteration 10980 : loss : 0.080786, loss_ce: 0.099243 iteration 10990 : loss : 0.087627, loss_ce: 0.115460 iteration 11000 : loss : 0.080671, loss_ce: 0.098646 iteration 11010 : loss : 0.092052, loss_ce: 0.103192 iteration 11020 : loss : 0.080950, loss_ce: 0.094892 iteration 11030 : loss : 0.098442, loss_ce: 0.118407 iteration 11040 : loss : 0.070013, loss_ce: 0.077410 iteration 11050 : loss : 0.091668, loss_ce: 0.119171 iteration 11060 : loss : 0.078992, loss_ce: 0.083712 iteration 11070 : loss : 0.101224, loss_ce: 0.131268 iteration 11080 : loss : 0.123427, loss_ce: 0.166440 iteration 11090 : loss : 0.085541, loss_ce: 0.106291 iteration 11100 : loss : 0.102423, loss_ce: 0.129037 iteration 11110 : loss : 0.094693, loss_ce: 0.113477 iteration 11120 : loss : 0.079940, loss_ce: 0.095083 iteration 11130 : loss : 0.089560, loss_ce: 0.111504 iteration 11140 : loss : 0.074348, loss_ce: 0.086340 iteration 11150 : loss : 0.083142, loss_ce: 0.100072 iteration 11160 : loss : 0.116917, loss_ce: 0.158619 iteration 11170 : loss : 0.077460, loss_ce: 0.087920 iteration 11180 : loss : 0.085271, loss_ce: 0.104645 iteration 11190 : loss : 0.091485, loss_ce: 0.109674 iteration 11200 : loss : 0.095553, loss_ce: 0.115588 iteration 11210 : loss : 0.110137, loss_ce: 0.140220 iteration 11220 : loss : 0.094743, loss_ce: 0.124572 iteration 11230 : loss : 0.067823, loss_ce: 0.070717 iteration 11240 : loss : 0.078809, loss_ce: 0.090788 iteration 11250 : loss : 0.115123, loss_ce: 0.133817 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_89_iter_11250.pth iteration 11260 : loss : 0.076460, loss_ce: 0.092722 iteration 11270 : loss : 0.094492, loss_ce: 0.123951 iteration 11280 : loss : 0.090165, loss_ce: 0.095820 iteration 11290 : loss : 0.089512, loss_ce: 0.112169 iteration 11300 : loss : 0.082005, loss_ce: 0.099461 iteration 11310 : loss : 0.067968, loss_ce: 0.078568 iteration 11320 : loss : 0.098727, loss_ce: 0.123269 iteration 11330 : loss : 0.119768, loss_ce: 0.161744 iteration 11340 : loss : 0.102273, loss_ce: 0.129170 iteration 11350 : loss : 0.094023, loss_ce: 0.113973 iteration 11360 : loss : 0.108741, loss_ce: 0.136654 iteration 11370 : loss : 0.088986, loss_ce: 0.118552 iteration 11380 : loss : 0.104332, loss_ce: 0.137684 iteration 11390 : loss : 0.089134, loss_ce: 0.111551 iteration 11400 : loss : 0.092534, loss_ce: 0.109134 iteration 11410 : loss : 0.090189, loss_ce: 0.114946 iteration 
11420 : loss : 0.114774, loss_ce: 0.155045 iteration 11430 : loss : 0.078735, loss_ce: 0.098177 iteration 11440 : loss : 0.079606, loss_ce: 0.099655 iteration 11450 : loss : 0.096378, loss_ce: 0.116580 iteration 11460 : loss : 0.068061, loss_ce: 0.080278 iteration 11470 : loss : 0.110208, loss_ce: 0.135865 iteration 11480 : loss : 0.091372, loss_ce: 0.109655 iteration 11490 : loss : 0.070689, loss_ce: 0.083230 iteration 11500 : loss : 0.089794, loss_ce: 0.112576 iteration 11510 : loss : 0.083085, loss_ce: 0.104881 iteration 11520 : loss : 0.082119, loss_ce: 0.101368 iteration 11530 : loss : 0.075580, loss_ce: 0.098832 iteration 11540 : loss : 0.067691, loss_ce: 0.080514 iteration 11550 : loss : 0.083939, loss_ce: 0.097590 iteration 11560 : loss : 0.108015, loss_ce: 0.143257 iteration 11570 : loss : 0.068689, loss_ce: 0.088916 iteration 11580 : loss : 0.089754, loss_ce: 0.114155 iteration 11590 : loss : 0.102284, loss_ce: 0.137633 iteration 11600 : loss : 0.091974, loss_ce: 0.117400 iteration 11610 : loss : 0.099563, loss_ce: 0.125618 iteration 11620 : loss : 0.099891, loss_ce: 0.131221 iteration 11630 : loss : 0.111611, loss_ce: 0.148027 iteration 11640 : loss : 0.064989, loss_ce: 0.074102 iteration 11650 : loss : 0.082138, loss_ce: 0.102657 iteration 11660 : loss : 0.089160, loss_ce: 0.112520 iteration 11670 : loss : 0.081450, loss_ce: 0.099185 iteration 11680 : loss : 0.077117, loss_ce: 0.101262 iteration 11690 : loss : 0.096224, loss_ce: 0.115307 iteration 11700 : loss : 0.098705, loss_ce: 0.134605 iteration 11710 : loss : 0.092430, loss_ce: 0.120290 iteration 11720 : loss : 0.098579, loss_ce: 0.119621 iteration 11730 : loss : 0.107744, loss_ce: 0.132797 iteration 11740 : loss : 0.084913, loss_ce: 0.109981 iteration 11750 : loss : 0.047774, loss_ce: 0.038551 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_93_iter_11750_loss_0.0386.pth with loss 0.0386 Conditional saves: 1/5 iteration 11760 : loss : 0.082700, loss_ce: 0.100009 iteration 11770 : loss : 0.088281, loss_ce: 0.105867 iteration 11780 : loss : 0.087635, loss_ce: 0.106728 iteration 11790 : loss : 0.075213, loss_ce: 0.092850 iteration 11800 : loss : 0.098807, loss_ce: 0.134969 iteration 11810 : loss : 0.081615, loss_ce: 0.098286 iteration 11820 : loss : 0.081933, loss_ce: 0.100278 iteration 11830 : loss : 0.097511, loss_ce: 0.126642 iteration 11840 : loss : 0.098470, loss_ce: 0.125345 iteration 11850 : loss : 0.079349, loss_ce: 0.106394 iteration 11860 : loss : 0.071679, loss_ce: 0.081905 iteration 11870 : loss : 0.069688, loss_ce: 0.083588 save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_94_iter_11875.pth iteration 11880 : loss : 0.088469, loss_ce: 0.117107 iteration 11890 : loss : 0.068057, loss_ce: 0.083024 iteration 11900 : loss : 0.070888, loss_ce: 0.084844 iteration 11910 : loss : 0.090334, loss_ce: 0.120090 iteration 11920 : loss : 0.072408, loss_ce: 0.087756 iteration 11930 : loss : 0.089955, loss_ce: 0.110379 iteration 11940 : loss : 0.086330, loss_ce: 0.111368 iteration 11950 : loss : 0.066208, loss_ce: 0.085121 iteration 11960 : loss : 0.068736, loss_ce: 0.087934 iteration 11970 : loss : 0.090870, loss_ce: 0.122550 iteration 11980 : loss : 0.087185, loss_ce: 0.103385 iteration 11990 : loss : 0.087415, loss_ce: 0.103163 iteration 12000 : loss : 0.065420, loss_ce: 0.074134 iteration 12010 : loss : 0.070713, loss_ce: 0.074345 iteration 12020 : loss : 0.087057, loss_ce: 0.110782 iteration 12030 : loss : 0.073918, loss_ce: 
0.093715
[Per-iteration training log truncated for readability: loss and loss_ce were printed every 10 iterations from iteration 12040 through the final iteration 20375, with total loss drifting from roughly 0.08-0.11 down to 0.05-0.08. Regular checkpoints were written every 625 iterations (every 5 epochs, epoch_99 through epoch_159); the conditional low-CE saves and the final save were:]
save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_113_iter_14250_loss_0.0575.pth with loss 0.0575 Conditional saves: 2/5
save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_121_iter_15250_loss_0.0375.pth with loss 0.0375 Conditional saves: 3/5
save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_126_iter_15875_loss_0.0559.pth with loss 0.0559 Conditional saves: 4/5
save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_129_iter_16250_loss_0.0573.pth with loss 0.0573 Conditional saves: 5/5
save model to model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_162.pth
------Training Stats------
Training finished in 15089.67 seconds (251.49 minutes).
Average time per iteration: 0.74s/it
Average loss: 0.0768
Testing¶
def calculate_metric_percase(pred, gt):  # Bug in original code fixed
    pred[pred > 0] = 1
    gt[gt > 0] = 1
    max_distance = np.sqrt(224**2 + 224**2)  # image diagonal, ~316.78 px for a 224x224 tile
    if pred.sum() > 0 and gt.sum() > 0:
        dice = metric.binary.dc(pred, gt)
        hd95 = metric.binary.hd95(pred, gt)
        return dice, hd95
    elif pred.sum() > 0 and gt.sum() == 0:
        # False positive: prediction contains buildings but the ground truth is empty
        return 0, max_distance  # i.e. ~316.78, the maximum possible HD95 for 224x224
    elif pred.sum() == 0 and gt.sum() > 0:
        # False negative: ground truth contains buildings but the prediction is empty
        return 0, max_distance  # i.e. ~316.78
    else:
        # Both prediction and ground truth are empty - perfect agreement
        return 1, 0
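# Added sanity-check sketch (not part of the original pipeline): exercise the
# edge-case handling above on toy 224x224 masks. Assumes medpy is installed;
# it is also imported in the testing cell further below.
import numpy as np
from medpy import metric  # needed by calculate_metric_percase

pred_demo = np.zeros((224, 224), dtype=np.uint8)
gt_demo = np.zeros((224, 224), dtype=np.uint8)
print(calculate_metric_percase(pred_demo.copy(), gt_demo.copy()))  # (1, 0): both empty
pred_demo[50:60, 50:60] = 1
print(calculate_metric_percase(pred_demo.copy(), gt_demo.copy()))  # (0, ~316.78): false positive
gt_demo[50:60, 50:60] = 1
print(calculate_metric_percase(pred_demo.copy(), gt_demo.copy()))  # (1.0, 0.0): exact match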
def test_single_volume(image_tensor, label_tensor, net, classes, patch_size=[224, 224], test_save_path=None, case=None):
    # Gemini Pro 2.5 rewrote most of this function to debug an issue with the original code and fix dimension problems
    # image_tensor is image_batch from the DataLoader, shape (1, 3, 224, 224), PyTorch tensor on CPU
    # label_tensor is label_batch from the DataLoader, shape (1, 1, 224, 224), PyTorch tensor on CPU
    # Determine device from the model:
    device = next(net.parameters()).device
    # Prepare input for the network: move to device, ensure correct dtype
    input_for_net = image_tensor.to(device).float()
    # Prepare label for metrics: convert to NumPy, squeeze batch and channel dimensions.
    label_np_for_metrics = label_tensor.squeeze(0).squeeze(0).cpu().detach().numpy()
    net.eval()  # Set model to evaluation mode
    with torch.no_grad():  # Disable gradient calculations
        # Forward pass: input_for_net is 4D (1, 3, 224, 224)
        outputs = net(input_for_net)  # Expected output shape: (1, num_classes, H, W)
        # Get prediction: apply softmax, then argmax. Output shape (1, H, W), then (H, W)
        prediction_np = torch.argmax(torch.softmax(outputs, dim=1), dim=1).squeeze(0).cpu().detach().numpy()
    metric_list = []
    # Class 0 (background) is excluded; metrics are computed for foreground classes only
    for i in range(1, classes):  # iterates only over the foreground class when classes=2 (0=bg, 1=fg)
        # Both prediction_np and label_np_for_metrics are (H, W)
        metric_list.append(calculate_metric_percase(prediction_np == i, label_np_for_metrics == i))
    if test_save_path is not None and case is not None:
        # image_tensor is (1, 3, 224, 224). Squeeze batch dim -> (3, 224, 224)
        image_np_for_saving = image_tensor.squeeze(0).cpu().detach().numpy()
        # Ensure SimpleITK gets NumPy arrays with correct types
        img_itk = sitk.GetImageFromArray(image_np_for_saving.astype(np.float32))
        prd_itk = sitk.GetImageFromArray(prediction_np.astype(np.float32))
        lab_itk = sitk.GetImageFromArray(label_np_for_metrics.astype(np.float32))
        # The test_save_path directory is created before calling inference.
        sitk.WriteImage(prd_itk, os.path.join(test_save_path, str(case) + "_pred.nii.gz"))
        sitk.WriteImage(img_itk, os.path.join(test_save_path, str(case) + "_img.nii.gz"))
        sitk.WriteImage(lab_itk, os.path.join(test_save_path, str(case) + "_gt.nii.gz"))
    return metric_list
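# Added note on the shape contract of test_single_volume (a minimal sketch with
# hypothetical random tensors; net must already be built and loaded):
# dummy_img = torch.randn(1, 3, 224, 224)            # one RGB tile
# dummy_lbl = torch.randint(0, 2, (1, 1, 224, 224))  # binary building mask
# test_single_volume(dummy_img, dummy_lbl, net, classes=2)
# -> [(dice, hd95)] with one tuple for the single foreground (building) class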
import argparse
import logging
import os
import random
import sys
import numpy as np
import torch
import torch.backends.cudnn as cudnn
import torch.nn as nn
from torch.utils.data import DataLoader
from tqdm.notebook import tqdm
from medpy import metric
from scipy.ndimage import zoom
import SimpleITK as sitk
parser = argparse.ArgumentParser()
parser.add_argument('--dataset', type=str, default='GF7')
parser.add_argument('--num_classes', type=int, default=2)
parser.add_argument('--max_iterations', type=int, default=30000)
parser.add_argument('--max_epochs', type=int, default=8)
parser.add_argument('--batch_size', type=int, default=4)
parser.add_argument('--n_gpu', type=int, default=1)
parser.add_argument('--is_savenii', action="store_true", help='whether to save results during inference', default=True)
parser.add_argument('--deterministic', type=int, default=1) # Make it 1 for reproducibility
parser.add_argument('--base_lr', type=float, default=0.01)
parser.add_argument('--img_size', type=int, default=224)
parser.add_argument('--seed', type=int, default=42)
parser.add_argument('--n_skip', type=int, default=3)
parser.add_argument('--vit_name', type=str, default='R50-ViT-B_16')
parser.add_argument('--vit_patches_size', type=int, default=16)
parser.add_argument('--test_save_dir', type=str, default='predictions', help='saving prediction as nii!')
# Add these two for GF7Dataset
parser.add_argument('--image_dir', type=str, help='Path to satellite images')
parser.add_argument('--mask_dir', type=str, help='Path to segmentation masks')
if epc is None:
    epc = '50'  # Default value for max_epochs
# epc = "XX"  # Set this to the desired number of epochs to select a specific model
# Parse args manually for notebook
args = parser.parse_args(args=[
    '--dataset', 'GF7',
    '--num_classes', '2',
    '--max_epochs', epc,  # must match the trained run's epoch count (163 for the run evaluated below)
    '--batch_size', '25',
    '--n_gpu', '1',
    '--base_lr', '0.001',
    '--img_size', '224',
    '--seed', '42',
    '--n_skip', '3',
    '--vit_name', 'R50-ViT-B_16',
    '--vit_patches_size', '16',
    '--test_save_dir', 'predictions',
    '--is_savenii',  # boolean flag; no value needed
    '--image_dir', 'data/GF-7 Building (3Bands)/Test/image',  # test-split images
    '--mask_dir', 'data/GF-7 Building (3Bands)/Test/label'  # test-split masks
])
print(args)
Namespace(dataset='GF7', num_classes=2, max_iterations=30000, max_epochs=163, batch_size=25, n_gpu=1, is_savenii=True, deterministic=1, base_lr=0.001, img_size=224, seed=42, n_skip=3, vit_name='R50-ViT-B_16', vit_patches_size=16, test_save_dir='predictions', image_dir='data/GF-7 Building (3Bands)/Test/image', mask_dir='data/GF-7 Building (3Bands)/Test/label')
def inference(args, model, test_save_path=None):
    print("\n\nStarting Inference...")
    print("Test Save Path:", test_save_path)
    # Use GF7Dataset
    db_test = GF7Dataset(
        image_dir=args.image_dir,
        mask_dir=args.mask_dir,
        image_size=args.img_size,
        transform=None  # No transform for inference
    )
    testloader = DataLoader(db_test, batch_size=1, shuffle=False, num_workers=0)
    print("The length of test set is: {}".format(len(db_test)))
    logging.info("{} test iterations per epoch".format(len(testloader)))
    model.eval()  # Set the PyTorch model to evaluation mode
    metric_list = 0.0
    metric_list_full = []
    with tqdm(total=len(testloader), desc="Testing", ncols=500, leave=True) as pbar:
        for i_batch, (image_batch, label_batch) in enumerate(testloader):
            h, w = image_batch.size()[2:]
            metric_i = test_single_volume(image_batch, label_batch, model, classes=args.num_classes, patch_size=[args.img_size, args.img_size],
                                          test_save_path=test_save_path, case=str(i_batch))
            metric_list += np.array(metric_i)
            image_filename = os.path.basename(db_test.image_paths[i_batch])
            metric_list_full.append({'filename': image_filename, 'metrics': metric_i})
            # Log every 15 iterations
            if i_batch % 15 == 0:
                logging.info('idx %d case %s mean_dice %f mean_hd95 %f' % (i_batch, str(i_batch), np.mean(metric_i, axis=0)[0], np.mean(metric_i, axis=0)[1]))
            pbar.update(1)
    metric_list = metric_list / len(db_test)
    for i in range(1, args.num_classes):
        logging.info('Mean class %d mean_dice %f mean_hd95 %f' % (i, metric_list[i-1][0], metric_list[i-1][1]))
    performance = np.mean(metric_list, axis=0)[0]
    mean_hd95 = np.mean(metric_list, axis=0)[1]
    logging.info('Testing performance in best model: mean_dice : %f mean_hd95 : %f' % (performance, mean_hd95))
    print('\n\n Testing Finished!')
    return metric_list_full
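# Optional added sketch (not part of the original pipeline; assumes pandas is
# installed): flatten the per-file metrics that inference() returns into a
# table for per-tile error analysis, e.g. results_to_dataframe(test_results).
def results_to_dataframe(metric_list_full):
    import pandas as pd
    rows = []
    for entry in metric_list_full:
        # entry['metrics'] holds one (dice, hd95) tuple per foreground class
        for cls_idx, (dice, hd95) in enumerate(entry['metrics'], start=1):
            rows.append({'filename': entry['filename'], 'class': cls_idx,
                         'dice': dice, 'hd95': hd95})
    return pd.DataFrame(rows)
# Example: results_to_dataframe(test_results).sort_values('dice').head()
# lists the worst-segmented tiles first.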
if not args.deterministic:
    cudnn.benchmark = True
    cudnn.deterministic = False
else:
    cudnn.benchmark = False
    cudnn.deterministic = True
random.seed(args.seed)
np.random.seed(args.seed)
torch.manual_seed(args.seed)
torch.cuda.manual_seed(args.seed)
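# Added note: seeding plus cudnn.deterministic = True makes runs repeatable in
# practice; PyTorch only guarantees full determinism when
# torch.use_deterministic_algorithms(True) is also set.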
# -----------------------
# Dataset Configuration
# -----------------------
dataset_name = 'GF7'
dataset_config = {
    'GF7': {
        'image_dir': args.image_dir,
        'mask_dir': args.mask_dir,
        'num_classes': 2
    }
}
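# Linear learning-rate scaling against the reference batch size of 24, applied
# only when batch_size is a multiple of 6. With batch_size=25, 25 % 6 == 1, so
# the condition below is False and base_lr stays at 0.001 (matching the
# lr0.001 tag in the checkpoint directory name).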
if args.batch_size != 24 and args.batch_size % 6 == 0:
    args.base_lr *= args.batch_size / 24
args.dataset = dataset_name
args.num_classes = dataset_config[dataset_name]['num_classes']
args.image_dir = dataset_config[dataset_name]['image_dir']
args.mask_dir = dataset_config[dataset_name]['mask_dir']
args.is_pretrain = True
args.exp = 'TU_' + dataset_name + str(args.img_size)
snapshot_path = "model/{}/{}".format(args.exp, 'TU')
snapshot_path = snapshot_path + '_pretrain' if args.is_pretrain else snapshot_path
snapshot_path += f"_{args.vit_name}_skip{args.n_skip}"
snapshot_path = snapshot_path + '_vitpatch' + str(args.vit_patches_size) if args.vit_patches_size!=16 else snapshot_path
snapshot_path = snapshot_path + '_epo' + str(args.max_epochs) if args.max_epochs != 30 else snapshot_path
snapshot_path = snapshot_path+'_bs'+str(args.batch_size)
snapshot_path = snapshot_path + '_lr' + str(args.base_lr) if args.base_lr != 0.01 else snapshot_path
snapshot_path = snapshot_path + '_'+str(args.img_size)
snapshot_path = snapshot_path + '_s'+str(args.seed) if args.seed!=1234 else snapshot_path
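# With the arguments above, snapshot_path resolves to
# model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42,
# the same directory the training checkpoints were written to.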
# Create snapshot directory
if not os.path.exists(snapshot_path):
    os.makedirs(snapshot_path)
# -----------------------
# ViT Config and Model
# -----------------------
config_vit = CONFIGS[args.vit_name]
config_vit.n_classes = args.num_classes
config_vit.n_skip = args.n_skip
# This line was added here; it should be checked against the patch-size configuration used in the training script as well
config_vit.patches.size = (args.vit_patches_size, args.vit_patches_size)
if 'R50' in args.vit_name:
    grid_size = int(args.img_size / args.vit_patches_size)
    config_vit.patches.grid = (grid_size, grid_size)
# Build model
net = VisionTransformer(config_vit, img_size=args.img_size, num_classes=config_vit.n_classes).to(device)
snapshot = os.path.join(snapshot_path, 'best_model.pth')
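# Prefer best_model.pth; otherwise fall back to the final-epoch checkpoint
# (epoch_162.pth for this 163-epoch run).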
if not os.path.exists(snapshot): snapshot = snapshot.replace('best_model', 'epoch_'+str(args.max_epochs-1))
net.load_state_dict(torch.load(snapshot.replace('\\', '/')))
snapshot_name = snapshot_path.split('/')[-1]
log_folder = './test_log/test_log_' + args.exp
os.makedirs(log_folder, exist_ok=True)
logging.basicConfig(filename=log_folder + '/'+snapshot_name+".txt", level=logging.INFO, format='[%(asctime)s.%(msecs)03d] %(message)s', datefmt='%H:%M:%S')
logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))
logging.info(str(args))
logging.info(snapshot_name)
if args.is_savenii:
    args.test_save_dir = 'predictions'
    test_save_path = os.path.join(args.test_save_dir, args.exp, snapshot_name)
    os.makedirs(test_save_path, exist_ok=True)
else:
    test_save_path = None
test_results = inference(args, net, test_save_path)
Namespace(dataset='GF7', num_classes=2, max_iterations=30000, max_epochs=163, batch_size=25, n_gpu=1, is_savenii=True, deterministic=1, base_lr=0.001, img_size=224, seed=42, n_skip=3, vit_name='R50-ViT-B_16', vit_patches_size=16, test_save_dir='predictions', image_dir='data/GF-7 Building (3Bands)/Test/image', mask_dir='data/GF-7 Building (3Bands)/Test/label', is_pretrain=True, exp='TU_GF7224') Namespace(dataset='GF7', num_classes=2, max_iterations=30000, max_epochs=163, batch_size=25, n_gpu=1, is_savenii=True, deterministic=1, base_lr=0.001, img_size=224, seed=42, n_skip=3, vit_name='R50-ViT-B_16', vit_patches_size=16, test_save_dir='predictions', image_dir='data/GF-7 Building (3Bands)/Test/image', mask_dir='data/GF-7 Building (3Bands)/Test/label', is_pretrain=True, exp='TU_GF7224') Namespace(dataset='GF7', num_classes=2, max_iterations=30000, max_epochs=163, batch_size=25, n_gpu=1, is_savenii=True, deterministic=1, base_lr=0.001, img_size=224, seed=42, n_skip=3, vit_name='R50-ViT-B_16', vit_patches_size=16, test_save_dir='predictions', image_dir='data/GF-7 Building (3Bands)/Test/image', mask_dir='data/GF-7 Building (3Bands)/Test/label', is_pretrain=True, exp='TU_GF7224') TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42 TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42 TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42 Starting Inference... Test Save Path: predictions\TU_GF7224\TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42 The length of test set is: 1035 1035 test iterations per epoch 1035 test iterations per epoch 1035 test iterations per epoch
Testing: 0%| ā¦
idx 0 case 0 mean_dice 0.929350 mean_hd95 8.944272
idx 15 case 15 mean_dice 0.914220 mean_hd95 6.324555
idx 30 case 30 mean_dice 0.800000 mean_hd95 9.337485
idx 45 case 45 mean_dice 0.842473 mean_hd95 68.000000
idx 60 case 60 mean_dice 0.784546 mean_hd95 8.062258
idx 75 case 75 mean_dice 0.621569 mean_hd95 37.913413
idx 90 case 90 mean_dice 1.000000 mean_hd95 0.000000
idx 105 case 105 mean_dice 0.810729 mean_hd95 5.099020
idx 120 case 120 mean_dice 0.782405 mean_hd95 8.062258
idx 135 case 135 mean_dice 0.865285 mean_hd95 4.242641
idx 150 case 150 mean_dice 0.802120 mean_hd95 7.155089
idx 165 case 165 mean_dice 0.678227 mean_hd95 7.071068
idx 180 case 180 mean_dice 0.855857 mean_hd95 6.000000
idx 195 case 195 mean_dice 0.752144 mean_hd95 36.228031
idx 210 case 210 mean_dice 0.829974 mean_hd95 5.000000
idx 225 case 225 mean_dice 1.000000 mean_hd95 0.000000
idx 240 case 240 mean_dice 0.909986 mean_hd95 16.000000
idx 255 case 255 mean_dice 0.859009 mean_hd95 10.440307
idx 270 case 270 mean_dice 0.806375 mean_hd95 2.236068
idx 285 case 285 mean_dice 0.805513 mean_hd95 18.343830
idx 300 case 300 mean_dice 0.926138 mean_hd95 8.892980
idx 315 case 315 mean_dice 0.783092 mean_hd95 3.605551
idx 330 case 330 mean_dice 0.747378 mean_hd95 4.000000
idx 345 case 345 mean_dice 0.928267 mean_hd95 7.615773
idx 360 case 360 mean_dice 0.978875 mean_hd95 24.634171
idx 375 case 375 mean_dice 0.559461 mean_hd95 24.000000
idx 390 case 390 mean_dice 0.867404 mean_hd95 3.605551
idx 405 case 405 mean_dice 0.820702 mean_hd95 11.133097
idx 420 case 420 mean_dice 0.000000 mean_hd95 316.783838
idx 435 case 435 mean_dice 1.000000 mean_hd95 0.000000
idx 450 case 450 mean_dice 1.000000 mean_hd95 0.000000
idx 465 case 465 mean_dice 0.863992 mean_hd95 5.385165
idx 480 case 480 mean_dice 0.890754 mean_hd95 9.211336
idx 495 case 495 mean_dice 0.778914 mean_hd95 12.136657
idx 510 case 510 mean_dice 0.826446 mean_hd95 1.269239
idx 525 case 525 mean_dice 0.854645 mean_hd95 5.099020
idx 540 case 540 mean_dice 1.000000 mean_hd95 0.000000
idx 555 case 555 mean_dice 0.793263 mean_hd95 5.000000
idx 570 case 570 mean_dice 0.928225 mean_hd95 7.280110
idx 585 case 585 mean_dice 0.869702 mean_hd95 3.162278
idx 600 case 600 mean_dice 0.765902 mean_hd95 38.208501
idx 615 case 615 mean_dice 0.873792 mean_hd95 3.605551
idx 630 case 630 mean_dice 0.821119 mean_hd95 9.219544
idx 645 case 645 mean_dice 0.779711 mean_hd95 5.830952
idx 660 case 660 mean_dice 0.851018 mean_hd95 4.000000
idx 675 case 675 mean_dice 0.880381 mean_hd95 2.236068
idx 690 case 690 mean_dice 0.736128 mean_hd95 14.406547
idx 705 case 705 mean_dice 0.854046 mean_hd95 15.578802
idx 720 case 720 mean_dice 0.808097 mean_hd95 17.000000
idx 735 case 735 mean_dice 0.749478 mean_hd95 16.278821
idx 750 case 750 mean_dice 0.823380 mean_hd95 5.830952
idx 765 case 765 mean_dice 0.867295 mean_hd95 3.162278
idx 780 case 780 mean_dice 0.810423 mean_hd95 7.000000
idx 795 case 795 mean_dice 0.606787 mean_hd95 15.626816
idx 810 case 810 mean_dice 0.700588 mean_hd95 21.931712
idx 825 case 825 mean_dice 0.923583 mean_hd95 10.037407
idx 840 case 840 mean_dice 0.781846 mean_hd95 4.000000
idx 855 case 855 mean_dice 0.792012 mean_hd95 10.158431
idx 870 case 870 mean_dice 0.884778 mean_hd95 5.000000
idx 885 case 885 mean_dice 0.829584 mean_hd95 13.084221
idx 900 case 900 mean_dice 0.910737 mean_hd95 4.242641
idx 915 case 915 mean_dice 0.000000 mean_hd95 316.783838
idx 930 case 930 mean_dice 0.819588 mean_hd95 5.830952
idx 945 case 945 mean_dice 1.000000 mean_hd95 0.000000
idx 960 case 960 mean_dice 1.000000 mean_hd95 0.000000
idx 975 case 975 mean_dice 0.000000 mean_hd95 316.783838
idx 990 case 990 mean_dice 0.843447 mean_hd95 7.071068
idx 1005 case 1005 mean_dice 0.787607 mean_hd95 9.992443
idx 1020 case 1020 mean_dice 0.873135 mean_hd95 4.000000
Mean class 1 mean_dice 0.751649 mean_hd95 27.417227
Testing performance in best model: mean_dice : 0.751649 mean_hd95 : 27.417227
Testing Finished!
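The cases scoring mean_dice 0.000000 with mean_hd95 316.783838 (e.g. idx 420, 915 and 975) are tiles where the prediction and the ground truth disagree about whether any building is present at all; HD95 is undefined in that situation, so the evaluation assigns the image diagonal as a worst-case sentinel. A quick sanity check of that constant (a sketch; 224 is the test-time tile size used throughout this notebook):

import numpy as np
# Worst-case HD95 sentinel: the diagonal of a 224 x 224 tile
print(np.sqrt(224**2 + 224**2))  # 316.7838... -> matches the 316.783838 entries in the log above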
Analysing the Model Results¶
test_results
# Histogram of Dice Coefficient
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# Convert Test Results to Pandas DataFrame
test_results_df = pd.DataFrame([
{
'Filename': d['filename'],
'Dice Coefficient': d['metrics'][0][0],
'HD95': d['metrics'][0][1]
}
for d in test_results
])
test_results_df['City'] = test_results_df['Filename'].apply(lambda x: x.split('_')[0])
test_results_df['Image'] = test_results_df['Filename'].apply(lambda x: x.split('_')[1])
print("Median Scores:", test_results_df[['Dice Coefficient', 'HD95']].median())
print("Uper Quartile Scores:", test_results_df[['Dice Coefficient', 'HD95']].quantile(0.75))
print("Lower Quartile Scores:", test_results_df[['Dice Coefficient', 'HD95']].quantile(0.25))
# Plot histogram of Dice Coefficient
plt.figure(figsize=(10, 6))
sns.histplot(test_results_df['Dice Coefficient'], bins=20, kde=True)
plt.title('Histogram of Dice Coefficient')
plt.xlabel('Dice Coefficient')
plt.ylabel('Frequency')
plt.grid()
plt.show()
# Plot histogram of HD95
plt.figure(figsize=(10, 6))
sns.histplot(test_results_df['HD95'], bins=20, kde=True, color='orange')
median_hd95 = np.median(test_results_df['HD95'])
plt.title('Histogram of HD95')
plt.xlabel('HD95')
plt.ylabel('Frequency')
plt.grid()
plt.show()
Median Scores: Dice Coefficient    0.815338
HD95                8.000000
dtype: float64
Upper Quartile Scores: Dice Coefficient     0.869990
HD95                16.855625
Name: 0.75, dtype: float64
Lower Quartile Scores: Dice Coefficient    0.728367
HD95                5.000000
Name: 0.25, dtype: float64
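The mean HD95 reported earlier (27.42) sits far above the median (8.00) because the sentinel-valued failure tiles dominate the mean. A short sketch, assuming the `test_results_df` built above, to quantify those edge cases:

# Quantify the edge cases that skew the mean HD95 (sketch; uses test_results_df from above)
n_failed = (test_results_df['Dice Coefficient'] == 0.0).sum()   # total-disagreement tiles (HD95 sentinel)
n_perfect = (test_results_df['Dice Coefficient'] == 1.0).sum()  # tiles where both masks are empty
print(f"Failed tiles (Dice = 0): {n_failed} of {len(test_results_df)}")
print(f"Perfect tiles (Dice = 1): {n_perfect} of {len(test_results_df)}")
# HD95 statistics with the sentinel tiles excluded
print("Mean HD95 without failed tiles:", test_results_df.loc[test_results_df['Dice Coefficient'] > 0.0, 'HD95'].mean())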
Scores By City¶
# List the cities present in the test set
test_results_df['City'].unique()
array(['Chongqing', 'Guangzhou', 'Lanzhou', 'Ningbo', 'Shenzhen',
'Tianjin'], dtype=object)
# Scores By City
by_city_result = test_results_df.groupby('City').agg({
'Dice Coefficient': ['mean', 'std'],
'HD95': ['mean', 'std']
}).reset_index()
# Rename columns for clarity
by_city_result.columns = ['City', 'Dice Coefficient Mean', 'Dice Coefficient Std', 'HD95 Mean', 'HD95 Std']
# Add Row for Overall
overall_row = pd.DataFrame({
'City': ['Overall'],
'Dice Coefficient Mean': [test_results_df['Dice Coefficient'].mean()],
'Dice Coefficient Std': [test_results_df['Dice Coefficient'].std()],
'HD95 Mean': [test_results_df['HD95'].mean()],
'HD95 Std': [test_results_df['HD95'].std()],
})
by_city_result = pd.concat([by_city_result, overall_row], ignore_index=True)
# Display the results as a rendered table
print("Scores by City:")
display(by_city_result)
# Histogram of Dice Coefficient by City
plt.figure(figsize=(10, 6))
sns.kdeplot(data=test_results_df, x='Dice Coefficient', hue='City')
plt.title('KDE Plot of Dice Coefficient by City')
plt.xlabel('Dice Coefficient')
plt.ylabel('Density')
plt.grid()
plt.tight_layout()
plt.show()
# Histogram of HD95 by City
plt.figure(figsize=(10, 6))
sns.kdeplot(data=test_results_df, x='HD95', hue='City')
plt.title('KDE Plot of HD95 by City')
plt.xlabel('HD95')
plt.ylabel('Density')
plt.grid()
plt.tight_layout()
plt.show()
Scores by City:
|   | City | Dice Coefficient Mean | Dice Coefficient Std | HD95 Mean | HD95 Std |
|---|---|---|---|---|---|
| 0 | Chongqing | 0.744319 | 0.222668 | 26.693538 | 59.478517 |
| 1 | Guangzhou | 0.722874 | 0.227968 | 29.421031 | 56.687649 |
| 2 | Lanzhou | 0.766654 | 0.249940 | 33.305769 | 74.004024 |
| 3 | Ningbo | 0.785484 | 0.146918 | 15.573055 | 23.415616 |
| 4 | Shenzhen | 0.742552 | 0.176550 | 23.645208 | 51.723232 |
| 5 | Tianjin | 0.752744 | 0.262597 | 34.507415 | 79.728446 |
| 6 | Overall | 0.751649 | 0.220762 | 27.417227 | 61.267373 |
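The large per-city standard deviations are driven by the same sentinel tiles appearing in every city, so medians give a more robust city-level comparison. A sketch, reusing `test_results_df` from above:

# Per-city medians are less sensitive to the 316.78 HD95 sentinel (sketch)
by_city_median = test_results_df.groupby('City')[['Dice Coefficient', 'HD95']].median()
by_city_median.columns = ['Dice Coefficient Median', 'HD95 Median']
display(by_city_median)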
First 5 and Last 5 Cases¶
def plot_prediction(idx, title):
# Load image, prediction, and ground truth from saved .nii.gz files
img_path = os.path.join(test_save_path, f"{idx}_img.nii.gz")
pred_path = os.path.join(test_save_path, f"{idx}_pred.nii.gz")
gt_path = os.path.join(test_save_path, f"{idx}_gt.nii.gz")
img = sitk.GetArrayFromImage(sitk.ReadImage(img_path))
pred = sitk.GetArrayFromImage(sitk.ReadImage(pred_path))
gt = sitk.GetArrayFromImage(sitk.ReadImage(gt_path))
    plt.figure(figsize=(15, 4))
    # The figure title is set below, once this case's metrics are loaded
plt.subplot(1, 3, 1)
if img.shape[0] == 3:
# Normalize to [0, 1] for RGB display
img_disp = img.astype(np.float32)
img_disp = (img_disp - img_disp.min()) / (img_disp.max() - img_disp.min() + 1e-8)
plt.imshow(img_disp.transpose(1, 2, 0))
else:
plt.imshow(img[0], cmap='gray')
plt.title("Image")
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(pred, cmap='gray')
plt.title("Prediction")
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(gt, cmap='gray')
plt.title("Ground Truth")
plt.axis('off')
    # Add the city, Dice Coefficient, and HD95 to the figure title
    dice = test_results_df.iloc[idx]['Dice Coefficient']
    hd95 = test_results_df.iloc[idx]['HD95']
    city = test_results_df.iloc[idx]['City']
    plt.suptitle(f"{title} : {city} - Dice: {dice:.4f}, HD95: {hd95:.4f}", fontsize=16)
# Increase padding between suptitle and subplots
plt.subplots_adjust(top=0.85)
plt.tight_layout()
plt.show()
# Plot predictions for the first 5 cases
for i in range(5):
plot_prediction(i, f"Prediction for Case {i}")
# Plot predictions for the last 5 cases
for i in range(len(test_results)-5, len(test_results)):
plot_prediction(i, f"Prediction for Case {i}")
Best and Worst Cases (Dice Score 1 vs 0)¶
# Plot the best 5 and worst 5 cases
# Get the indices of the best and worst 5 cases based on Dice Coefficient
best_5_cases = test_results_df['Dice Coefficient'].nlargest(5).index
worst_5_cases = test_results_df['Dice Coefficient'].nsmallest(5).index
# Plot the best 5 cases
print("Best 5 Cases (Highest Dice Coefficient):", best_5_cases.tolist())
for i in best_5_cases:
plot_prediction(i, f"Best Case {i}")
# Plot the worst 5 cases
print("Worst 5 Cases (Lowest Dice Coefficient):", worst_5_cases.tolist())
for i in worst_5_cases:
plot_prediction(i, f"Worst Case {i}")
Best 5 Cases (Highest Dice Coefficient): [6, 18, 23, 53, 58]
Worst 5 Cases (Lowest Dice Coefficient): [13, 66, 76, 79, 85]
Best and Worst Scores Ignoring 0 and 1 Scores¶
# Plot the best 5 and worst 5 cases
# Get the indices of the best and worst 5 cases based on Dice Coefficient, ignoring perfect (1.0) and zero (0.0) scores
filtered_df = test_results_df[(test_results_df['Dice Coefficient'] < 1.0) & (test_results_df['Dice Coefficient'] > 0.0)]
best_5_cases = filtered_df['Dice Coefficient'].nlargest(5).index
worst_5_cases = filtered_df['Dice Coefficient'].nsmallest(5).index
# Plot the best 5 cases
print("Best 5 Cases (Highest Dice Coefficient):")
for i in best_5_cases:
plot_prediction(i, f"Best Case {i}")
# Plot the worst 5 cases
print("Worst 5 Cases (Lowest Dice Coefficient):")
for i in worst_5_cases:
plot_prediction(i, f"Worst Case {i}")
Best 5 Cases (Highest Dice Coefficient):
Worst 5 Cases (Lowest Dice Coefficient):
Model Structure¶
from torchsummary import summary
# Visualize the VisionTransformer model structure
# Use a sample input size matching your data (e.g., 3x224x224 for RGB images)
summary(net, input_size=(3, args.img_size, args.img_size), device=str(device))
# from torchviz import make_dot
# import graphviz
# # Create a dummy input matching model's input shape
# dummy_input = torch.randn(1, 3, args.img_size, args.img_size).to(device)
# # Forward pass to get the output
# output = net(dummy_input)
# Create the visualization (disabled; requires torchviz/graphviz)
# dot = make_dot(output, params=dict(net.named_parameters()))
# dot.render("model_visualization", format="pdf")
===========================================================================
Layer (type:depth-idx)                                  Param #
===========================================================================
├─Transformer: 1-1                                      --
│    ├─Embeddings: 2-1                                  --
│    │    ├─ResNetV2: 3-1                               11,894,848
│    │    ├─Conv2d: 3-2                                 787,200
│    │    ├─Dropout: 3-3                                --
│    ├─Encoder: 2-2                                     --
│    │    ├─ModuleList: 3-4                             85,054,464
│    │    ├─LayerNorm: 3-5                              1,536
├─DecoderCup: 1-2                                       --
│    ├─Conv2dReLU: 2-3                                  --
│    │    ├─Conv2d: 3-6                                 3,538,944
│    │    ├─BatchNorm2d: 3-7                            1,024
│    │    ├─ReLU: 3-8                                   --
│    ├─ModuleList: 2-4                                  --
│    │    ├─DecoderBlock: 3-9                           2,950,144
│    │    ├─DecoderBlock: 3-10                          737,792
│    │    ├─DecoderBlock: 3-11                          147,712
│    │    ├─DecoderBlock: 3-12                          11,584
├─SegmentationHead: 1-3                                 --
│    ├─Conv2d: 2-5                                      290
│    ├─Identity: 2-6                                    --
===========================================================================
Total params: 105,125,538
Trainable params: 105,125,538
Non-trainable params: 0
===========================================================================
Detailed Breakdown¶
summary(net, input_size=(3, args.img_size, args.img_size), device=str(device), depth=6)
================================================================================
Layer (type:depth-idx)                                            Param #
================================================================================
├─Transformer: 1-1                                                --
│    ├─Embeddings: 2-1                                            --
│    │    ├─ResNetV2: 3-1                                         --
│    │    │    ├─Sequential: 4-1                                  --
│    │    │    │    ├─StdConv2d: 5-1                              9,408
│    │    │    │    ├─GroupNorm: 5-2                              128
│    │    │    │    ├─ReLU: 5-3                                   --
│    │    │    ├─Sequential: 4-2                                  --
│    │    │    │    ├─Sequential: 5-4                             --
│    │    │    │    │    ├─PreActBottleneck: 6-1                  75,008
│    │    │    │    │    ├─PreActBottleneck: 6-2                  70,400
│    │    │    │    │    ├─PreActBottleneck: 6-3                  70,400
│    │    │    │    ├─Sequential: 5-5                             --
│    │    │    │    │    ├─PreActBottleneck: 6-4                  379,392
│    │    │    │    │    ├─PreActBottleneck: 6-5                  280,064
│    │    │    │    │    ├─PreActBottleneck: 6-6                  280,064
│    │    │    │    │    ├─PreActBottleneck: 6-7                  280,064
│    │    │    │    ├─Sequential: 5-6                             --
│    │    │    │    │    ├─PreActBottleneck: 6-8                  1,512,448
│    │    │    │    │    ├─PreActBottleneck: 6-9                  1,117,184
│    │    │    │    │    ├─PreActBottleneck: 6-10                 1,117,184
│    │    │    │    │    ├─PreActBottleneck: 6-11                 1,117,184
│    │    │    │    │    ├─PreActBottleneck: 6-12                 1,117,184
│    │    │    │    │    ├─PreActBottleneck: 6-13                 1,117,184
│    │    │    │    │    ├─PreActBottleneck: 6-14                 1,117,184
│    │    │    │    │    ├─PreActBottleneck: 6-15                 1,117,184
│    │    │    │    │    ├─PreActBottleneck: 6-16                 1,117,184
│    │    ├─Conv2d: 3-2                                           787,200
│    │    ├─Dropout: 3-3                                          --
│    ├─Encoder: 2-2                                               --
│    │    ├─ModuleList: 3-4                                       --
│    │    │    ├─Block: 4-3                                       --
│    │    │    │    ├─LayerNorm: 5-7                              1,536
│    │    │    │    ├─LayerNorm: 5-8                              1,536
│    │    │    │    ├─Mlp: 5-9                                    --
│    │    │    │    │    ├─Linear: 6-17                           2,362,368
│    │    │    │    │    ├─Linear: 6-18                           2,360,064
│    │    │    │    │    ├─Dropout: 6-19                          --
│    │    │    │    ├─Attention: 5-10                             --
│    │    │    │    │    ├─Linear: 6-20                           590,592
│    │    │    │    │    ├─Linear: 6-21                           590,592
│    │    │    │    │    ├─Linear: 6-22                           590,592
│    │    │    │    │    ├─Linear: 6-23                           590,592
│    │    │    │    │    ├─Dropout: 6-24                          --
│    │    │    │    │    ├─Dropout: 6-25                          --
│    │    │    │    │    ├─Softmax: 6-26                          --
│    │    │    ├─Block: 4-4                                       --
│    │    │    │    ├─LayerNorm: 5-11                             1,536
│    │    │    │    ├─LayerNorm: 5-12                             1,536
│    │    │    │    ├─Mlp: 5-13                                   --
│    │    │    │    │    ├─Linear: 6-27                           2,362,368
│    │    │    │    │    ├─Linear: 6-28                           2,360,064
│    │    │    │    │    ├─Dropout: 6-29                          --
│    │    │    │    ├─Attention: 5-14                             --
│    │    │    │    │    ├─Linear: 6-30                           590,592
│    │    │    │    │    ├─Linear: 6-31                           590,592
│    │    │    │    │    ├─Linear: 6-32                           590,592
│    │    │    │    │    ├─Linear: 6-33                           590,592
│    │    │    │    │    ├─Dropout: 6-34                          --
│    │    │    │    │    ├─Dropout: 6-35                          --
│    │    │    │    │    ├─Softmax: 6-36                          --
│    │    │    ├─Block: 4-5                                       --
│    │    │    │    ├─LayerNorm: 5-15                             1,536
│    │    │    │    ├─LayerNorm: 5-16                             1,536
│    │    │    │    ├─Mlp: 5-17                                   --
│    │    │    │    │    ├─Linear: 6-37                           2,362,368
│    │    │    │    │    ├─Linear: 6-38                           2,360,064
│    │    │    │    │    ├─Dropout: 6-39                          --
│    │    │    │    ├─Attention: 5-18                             --
│    │    │    │    │    ├─Linear: 6-40                           590,592
│    │    │    │    │    ├─Linear: 6-41                           590,592
│    │    │    │    │    ├─Linear: 6-42                           590,592
│    │    │    │    │    ├─Linear: 6-43                           590,592
│    │    │    │    │    ├─Dropout: 6-44                          --
│    │    │    │    │    ├─Dropout: 6-45                          --
│    │    │    │    │    ├─Softmax: 6-46                          --
│    │    │    ├─Block: 4-6                                       --
│    │    │    │    ├─LayerNorm: 5-19                             1,536
│    │    │    │    ├─LayerNorm: 5-20                             1,536
│    │    │    │    ├─Mlp: 5-21                                   --
│    │    │    │    │    ├─Linear: 6-47                           2,362,368
│    │    │    │    │    ├─Linear: 6-48                           2,360,064
│    │    │    │    │    ├─Dropout: 6-49                          --
│    │    │    │    ├─Attention: 5-22                             --
│    │    │    │    │    ├─Linear: 6-50                           590,592
│    │    │    │    │    ├─Linear: 6-51                           590,592
│    │    │    │    │    ├─Linear: 6-52                           590,592
│    │    │    │    │    ├─Linear: 6-53                           590,592
│    │    │    │    │    ├─Dropout: 6-54                          --
│    │    │    │    │    ├─Dropout: 6-55                          --
│    │    │    │    │    ├─Softmax: 6-56                          --
│    │    │    ├─Block: 4-7                                       --
│    │    │    │    ├─LayerNorm: 5-23                             1,536
│    │    │    │    ├─LayerNorm: 5-24                             1,536
│    │    │    │    ├─Mlp: 5-25                                   --
│    │    │    │    │    ├─Linear: 6-57                           2,362,368
│    │    │    │    │    ├─Linear: 6-58                           2,360,064
│    │    │    │    │    ├─Dropout: 6-59                          --
│    │    │    │    ├─Attention: 5-26                             --
│    │    │    │    │    ├─Linear: 6-60                           590,592
│    │    │    │    │    ├─Linear: 6-61                           590,592
│    │    │    │    │    ├─Linear: 6-62                           590,592
│    │    │    │    │    ├─Linear: 6-63                           590,592
│    │    │    │    │    ├─Dropout: 6-64                          --
│    │    │    │    │    ├─Dropout: 6-65                          --
│    │    │    │    │    ├─Softmax: 6-66                          --
│    │    │    ├─Block: 4-8                                       --
│    │    │    │    ├─LayerNorm: 5-27                             1,536
│    │    │    │    ├─LayerNorm: 5-28                             1,536
│    │    │    │    ├─Mlp: 5-29                                   --
│    │    │    │    │    ├─Linear: 6-67                           2,362,368
│    │    │    │    │    ├─Linear: 6-68                           2,360,064
│    │    │    │    │    ├─Dropout: 6-69                          --
│    │    │    │    ├─Attention: 5-30                             --
│    │    │    │    │    ├─Linear: 6-70                           590,592
│    │    │    │    │    ├─Linear: 6-71                           590,592
│    │    │    │    │    ├─Linear: 6-72                           590,592
│    │    │    │    │    ├─Linear: 6-73                           590,592
│    │    │    │    │    ├─Dropout: 6-74                          --
│    │    │    │    │    ├─Dropout: 6-75                          --
│    │    │    │    │    ├─Softmax: 6-76                          --
│    │    │    ├─Block: 4-9                                       --
│    │    │    │    ├─LayerNorm: 5-31                             1,536
│    │    │    │    ├─LayerNorm: 5-32                             1,536
│    │    │    │    ├─Mlp: 5-33                                   --
│    │    │    │    │    ├─Linear: 6-77                           2,362,368
│    │    │    │    │    ├─Linear: 6-78                           2,360,064
│    │    │    │    │    ├─Dropout: 6-79                          --
│    │    │    │    ├─Attention: 5-34                             --
│    │    │    │    │    ├─Linear: 6-80                           590,592
│    │    │    │    │    ├─Linear: 6-81                           590,592
│    │    │    │    │    ├─Linear: 6-82                           590,592
│    │    │    │    │    ├─Linear: 6-83                           590,592
│    │    │    │    │    ├─Dropout: 6-84                          --
│    │    │    │    │    ├─Dropout: 6-85                          --
│    │    │    │    │    ├─Softmax: 6-86                          --
│    │    │    ├─Block: 4-10                                      --
│    │    │    │    ├─LayerNorm: 5-35                             1,536
│    │    │    │    ├─LayerNorm: 5-36                             1,536
│    │    │    │    ├─Mlp: 5-37                                   --
│    │    │    │    │    ├─Linear: 6-87                           2,362,368
│    │    │    │    │    ├─Linear: 6-88                           2,360,064
│    │    │    │    │    ├─Dropout: 6-89                          --
│    │    │    │    ├─Attention: 5-38                             --
│    │    │    │    │    ├─Linear: 6-90                           590,592
│    │    │    │    │    ├─Linear: 6-91                           590,592
│    │    │    │    │    ├─Linear: 6-92                           590,592
│    │    │    │    │    ├─Linear: 6-93                           590,592
│    │    │    │    │    ├─Dropout: 6-94                          --
│    │    │    │    │    ├─Dropout: 6-95                          --
│    │    │    │    │    ├─Softmax: 6-96                          --
│    │    │    ├─Block: 4-11                                      --
│    │    │    │    ├─LayerNorm: 5-39                             1,536
│    │    │    │    ├─LayerNorm: 5-40                             1,536
│    │    │    │    ├─Mlp: 5-41                                   --
│    │    │    │    │    ├─Linear: 6-97                           2,362,368
│    │    │    │    │    ├─Linear: 6-98                           2,360,064
│    │    │    │    │    ├─Dropout: 6-99                          --
│    │    │    │    ├─Attention: 5-42                             --
│    │    │    │    │    ├─Linear: 6-100                          590,592
│    │    │    │    │    ├─Linear: 6-101                          590,592
│    │    │    │    │    ├─Linear: 6-102                          590,592
│    │    │    │    │    ├─Linear: 6-103                          590,592
│    │    │    │    │    ├─Dropout: 6-104                         --
│    │    │    │    │    ├─Dropout: 6-105                         --
│    │    │    │    │    ├─Softmax: 6-106                         --
│    │    │    ├─Block: 4-12                                      --
│    │    │    │    ├─LayerNorm: 5-43                             1,536
│    │    │    │    ├─LayerNorm: 5-44                             1,536
│    │    │    │    ├─Mlp: 5-45                                   --
│    │    │    │    │    ├─Linear: 6-107                          2,362,368
│    │    │    │    │    ├─Linear: 6-108                          2,360,064
│    │    │    │    │    ├─Dropout: 6-109                         --
│    │    │    │    ├─Attention: 5-46                             --
│    │    │    │    │    ├─Linear: 6-110                          590,592
│    │    │    │    │    ├─Linear: 6-111                          590,592
│    │    │    │    │    ├─Linear: 6-112                          590,592
│    │    │    │    │    ├─Linear: 6-113                          590,592
│    │    │    │    │    ├─Dropout: 6-114                         --
│    │    │    │    │    ├─Dropout: 6-115                         --
│    │    │    │    │    ├─Softmax: 6-116                         --
│    │    │    ├─Block: 4-13                                      --
│    │    │    │    ├─LayerNorm: 5-47                             1,536
│    │    │    │    ├─LayerNorm: 5-48                             1,536
│    │    │    │    ├─Mlp: 5-49                                   --
│    │    │    │    │    ├─Linear: 6-117                          2,362,368
│    │    │    │    │    ├─Linear: 6-118                          2,360,064
│    │    │    │    │    ├─Dropout: 6-119                         --
│    │    │    │    ├─Attention: 5-50                             --
│    │    │    │    │    ├─Linear: 6-120                          590,592
│    │    │    │    │    ├─Linear: 6-121                          590,592
│    │    │    │    │    ├─Linear: 6-122                          590,592
│    │    │    │    │    ├─Linear: 6-123                          590,592
│    │    │    │    │    ├─Dropout: 6-124                         --
│    │    │    │    │    ├─Dropout: 6-125                         --
│    │    │    │    │    ├─Softmax: 6-126                         --
│    │    │    ├─Block: 4-14                                      --
│    │    │    │    ├─LayerNorm: 5-51                             1,536
│    │    │    │    ├─LayerNorm: 5-52                             1,536
│    │    │    │    ├─Mlp: 5-53                                   --
│    │    │    │    │    ├─Linear: 6-127                          2,362,368
│    │    │    │    │    ├─Linear: 6-128                          2,360,064
│    │    │    │    │    ├─Dropout: 6-129                         --
│    │    │    │    ├─Attention: 5-54                             --
│    │    │    │    │    ├─Linear: 6-130                          590,592
│    │    │    │    │    ├─Linear: 6-131                          590,592
│    │    │    │    │    ├─Linear: 6-132                          590,592
│    │    │    │    │    ├─Linear: 6-133                          590,592
│    │    │    │    │    ├─Dropout: 6-134                         --
│    │    │    │    │    ├─Dropout: 6-135                         --
│    │    │    │    │    ├─Softmax: 6-136                         --
│    │    ├─LayerNorm: 3-5                                        1,536
├─DecoderCup: 1-2                                                 --
│    ├─Conv2dReLU: 2-3                                            --
│    │    ├─Conv2d: 3-6                                           3,538,944
│    │    ├─BatchNorm2d: 3-7                                      1,024
│    │    ├─ReLU: 3-8                                             --
│    ├─ModuleList: 2-4                                            --
│    │    ├─DecoderBlock: 3-9                                     --
│    │    │    ├─Conv2dReLU: 4-15                                 --
│    │    │    │    ├─Conv2d: 5-55                                2,359,296
│    │    │    │    ├─BatchNorm2d: 5-56                           512
│    │    │    │    ├─ReLU: 5-57                                  --
│    │    │    ├─Conv2dReLU: 4-16                                 --
│    │    │    │    ├─Conv2d: 5-58                                589,824
│    │    │    │    ├─BatchNorm2d: 5-59                           512
│    │    │    │    ├─ReLU: 5-60                                  --
│    │    │    ├─UpsamplingBilinear2d: 4-17                       --
│    │    ├─DecoderBlock: 3-10                                    --
│    │    │    ├─Conv2dReLU: 4-18                                 --
│    │    │    │    ├─Conv2d: 5-61                                589,824
│    │    │    │    ├─BatchNorm2d: 5-62                           256
│    │    │    │    ├─ReLU: 5-63                                  --
│    │    │    ├─Conv2dReLU: 4-19                                 --
│    │    │    │    ├─Conv2d: 5-64                                147,456
│    │    │    │    ├─BatchNorm2d: 5-65                           256
│    │    │    │    ├─ReLU: 5-66                                  --
│    │    │    ├─UpsamplingBilinear2d: 4-20                       --
│    │    ├─DecoderBlock: 3-11                                    --
│    │    │    ├─Conv2dReLU: 4-21                                 --
│    │    │    │    ├─Conv2d: 5-67                                110,592
│    │    │    │    ├─BatchNorm2d: 5-68                           128
│    │    │    │    ├─ReLU: 5-69                                  --
│    │    │    ├─Conv2dReLU: 4-22                                 --
│    │    │    │    ├─Conv2d: 5-70                                36,864
│    │    │    │    ├─BatchNorm2d: 5-71                           128
│    │    │    │    ├─ReLU: 5-72                                  --
│    │    │    ├─UpsamplingBilinear2d: 4-23                       --
│    │    ├─DecoderBlock: 3-12                                    --
│    │    │    ├─Conv2dReLU: 4-24                                 --
│    │    │    │    ├─Conv2d: 5-73                                9,216
│    │    │    │    ├─BatchNorm2d: 5-74                           32
│    │    │    │    ├─ReLU: 5-75                                  --
│    │    │    ├─Conv2dReLU: 4-25                                 --
│    │    │    │    ├─Conv2d: 5-76                                2,304
│    │    │    │    ├─BatchNorm2d: 5-77                           32
│    │    │    │    ├─ReLU: 5-78                                  --
│    │    │    ├─UpsamplingBilinear2d: 4-26                       --
├─SegmentationHead: 1-3                                           --
│    ├─Conv2d: 2-5                                                290
│    ├─Identity: 2-6                                              --
================================================================================
Total params: 105,125,538
Trainable params: 105,125,538
Non-trainable params: 0
================================================================================
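Reading the totals: the encoder holds almost all of the capacity. The ResNetV2 stem and patch embedding (11,894,848 + 787,200) plus the twelve transformer blocks and their final LayerNorm (85,054,464 + 1,536) come to about 97.7M parameters, while the whole DecoderCup contributes roughly 7.4M and the SegmentationHead only 290, together giving the 105,125,538 total. A sketch, assuming `net` is the instantiated VisionTransformer from earlier, that reproduces this split directly from the module tree:

# Sum parameters per top-level submodule of the network (sketch; assumes `net` is loaded)
for name, module in net.named_children():
    n_params = sum(p.numel() for p in module.parameters())
    print(f"{name}: {n_params:,} parameters")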
Testing Multiple Models¶
import glob
import os
def test_all_models_in_directory(model_directory, test_args):
"""
Automatically test all .pth files in a directory
Args:
model_directory: Path to directory containing .pth files
test_args: Test arguments
"""
# Find all .pth files in the directory
model_files = glob.glob(os.path.join(model_directory, "*.pth"))
if not model_files:
print(f"No .pth files found in {model_directory}")
return {}
print(f"Found {len(model_files)} model files:")
for file in model_files:
print(f" - {os.path.basename(file)}")
all_results = {}
for model_path in model_files:
model_name = os.path.basename(model_path).replace('.pth', '')
print(f"\n{'='*50}")
print(f"Testing Model: {model_name}")
print(f"{'='*50}")
# Create model
config_vit = CONFIGS[test_args.vit_name]
config_vit.n_classes = test_args.num_classes
config_vit.n_skip = test_args.n_skip
config_vit.patches.size = (test_args.vit_patches_size, test_args.vit_patches_size)
if 'R50' in test_args.vit_name:
grid_size = int(test_args.img_size / test_args.vit_patches_size)
config_vit.patches.grid = (grid_size, grid_size)
net = VisionTransformer(config_vit, img_size=test_args.img_size,
num_classes=config_vit.n_classes).to(device)
# Load model weights
try:
print(f"Loading model from: {model_path}")
net.load_state_dict(torch.load(model_path))
# Run inference
test_results = run_inference_simple(test_args, net)
all_results[model_name] = test_results
# print Model stats
print(f"\n{'='*50}")
print(f"Results For Model: {model_name}")
dice_scores = [r['dice'] for r in test_results]
print(f"{model_name}: Mean Dice = {np.mean(dice_scores):.4f}, Std = {np.std(dice_scores):.4f}")
print(f"{model_name}: Median Dice = {np.median(dice_scores):.4f}, Std = {np.std(dice_scores):.4f}")
print(f"\n{'-'*50}")
# HD95 scores
hd95_scores = [r['hd95'] for r in test_results]
print(f"{model_name}: Mean HD95 = {np.mean(hd95_scores):.4f}, Std = {np.std(hd95_scores):.4f}")
print(f"{model_name}: Median HD95 = {np.median(hd95_scores):.4f}, Std = {np.std(hd95_scores):.4f}")
print(f"{'='*50}")
# Clean up GPU memory
del net
torch.cuda.empty_cache()
except Exception as e:
print(f"Error loading model {model_path}: {e}")
all_results[model_name] = None
return all_results
def run_inference_simple(args, model):
"""Simplified inference without saving files"""
print("Starting inference...")
# Create test dataset
db_test = GF7Dataset(
image_dir=args.image_dir,
mask_dir=args.mask_dir,
image_size=args.img_size,
transform=None
)
testloader = DataLoader(db_test, batch_size=1, shuffle=False, num_workers=0)
print(f"Testing on {len(db_test)} samples")
model.eval()
all_metrics = []
# Maximum possible distance in a 224x224 image
    max_distance = np.sqrt(224**2 + 224**2)  # ≈ 316.8
with torch.no_grad():
for i, (image_batch, label_batch) in enumerate(tqdm(testloader, desc="Testing")):
# Move to device
image_batch = image_batch.to(device).float()
label_np = label_batch.squeeze().cpu().numpy()
# Forward pass
outputs = model(image_batch)
pred_np = torch.argmax(torch.softmax(outputs, dim=1), dim=1).squeeze().cpu().numpy()
            # Calculate metrics for the foreground class, handling empty-mask edge cases
if pred_np.sum() > 0 and label_np.sum() > 0:
dice = metric.binary.dc(pred_np, label_np)
hd95 = metric.binary.hd95(pred_np, label_np)
elif pred_np.sum() > 0 and label_np.sum() == 0:
# False positives: predicted buildings where none exist
dice, hd95 = 0, max_distance
elif pred_np.sum() == 0 and label_np.sum() > 0:
# False negatives: missed buildings that should exist
dice, hd95 = 0, max_distance
else:
# Both empty - perfect agreement
dice, hd95 = 1, 0
all_metrics.append({
'filename': os.path.basename(db_test.image_paths[i]),
'dice': dice,
'hd95': hd95
})
return all_metrics
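The four-way branch in run_inference_simple exists because medpy's dc and hd95 are only well defined when both masks contain foreground; the empty-mask cases are mapped by hand to the (0, sentinel) and (1, 0) conventions. A toy check of the non-empty branch on 4x4 masks (a sketch; uses the same medpy.metric import as the test code):

from medpy import metric
import numpy as np
# Identical non-empty masks: Dice should be 1.0 and HD95 should be 0.0
pred = np.zeros((4, 4), dtype=np.uint8)
gt = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:3] = 1
gt[1:3, 1:3] = 1
print(metric.binary.dc(pred, gt))    # 1.0
print(metric.binary.hd95(pred, gt))  # 0.0
# If only one mask were empty, hd95 would raise an error, which is why the loop
# assigns (0, max_distance) in that case and (1, 0) when both masks are empty.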
def compare_all_models(all_results):
"""Simple comparison of all model results"""
import pandas as pd
import matplotlib.pyplot as plt
comparison_data = []
for model_name, results in all_results.items():
if results is not None:
dice_scores = [r['dice'] for r in results]
hd95_scores = [r['hd95'] for r in results]
comparison_data.append({
'Model': model_name,
'Mean Dice': np.mean(dice_scores),
'Std Dice': np.std(dice_scores),
'Mean HD95': np.mean(hd95_scores),
'Std HD95': np.std(hd95_scores),
'Median Dice': np.median(dice_scores),
'Median HD95': np.median(hd95_scores),
'Max Dice': np.max(dice_scores),
'Min Dice': np.min(dice_scores)
})
# Create comparison DataFrame
comparison_df = pd.DataFrame(comparison_data)
# Sort by Mean Dice descending
comparison_df = comparison_df.sort_values('Mean Dice', ascending=False)
print("\nModel Comparison Results (Sorted by Mean Dice):")
print("="*100)
display(comparison_df)
# Plot comparison
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
# Dice Coefficient comparison
ax1.bar(range(len(comparison_df)), comparison_df['Mean Dice'],
yerr=comparison_df['Std Dice'], capsize=5)
ax1.set_title('Mean Dice Coefficient Comparison')
ax1.set_ylabel('Dice Coefficient')
ax1.set_xticks(range(len(comparison_df)))
ax1.set_xticklabels(comparison_df['Model'], rotation=45, ha='right')
ax1.grid(True, alpha=0.3)
# HD95 comparison
ax2.bar(range(len(comparison_df)), comparison_df['Mean HD95'],
yerr=comparison_df['Std HD95'], capsize=5)
ax2.set_title('Mean HD95 Comparison')
ax2.set_ylabel('HD95')
ax2.set_xticks(range(len(comparison_df)))
ax2.set_xticklabels(comparison_df['Model'], rotation=45, ha='right')
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
return comparison_df
model_directory = "model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42"
test_args = args
# Run comparison on all models in the directory
print("Testing all models in directory...")
results = test_all_models_in_directory(model_directory, test_args)
comparison = compare_all_models(results)
# Show the best performing model
if not comparison.empty:
best_model = comparison.iloc[0]
print(f"\nBest performing model: {best_model['Model']}")
print(f"Mean Dice: {best_model['Mean Dice']:.4f}")
print(f"Mean HD95: {best_model['Mean HD95']:.4f}")
Testing all models in directory... Found 31 model files: - epoch_104_iter_13125.pth - epoch_109_iter_13750.pth - epoch_114_iter_14375.pth - epoch_119_iter_15000.pth - epoch_124_iter_15625.pth - epoch_129_iter_16250.pth - epoch_134_iter_16875.pth - epoch_139_iter_17500.pth - epoch_144_iter_18125.pth - epoch_149.pth - epoch_149_iter_18750.pth - epoch_154_iter_19375.pth - epoch_159_iter_20000.pth - epoch_162.pth - epoch_49_iter_6250.pth - epoch_54_iter_6875.pth - epoch_59_iter_7500.pth - epoch_64_iter_8125.pth - epoch_69_iter_8750.pth - epoch_74_iter_9375.pth - epoch_79_iter_10000.pth - epoch_84_iter_10625.pth - epoch_89_iter_11250.pth - epoch_94_iter_11875.pth - epoch_99.pth - epoch_99_iter_12500.pth - LOW_CE_epoch_113_iter_14250_loss_0.0575.pth - LOW_CE_epoch_121_iter_15250_loss_0.0375.pth - LOW_CE_epoch_126_iter_15875_loss_0.0559.pth - LOW_CE_epoch_129_iter_16250_loss_0.0573.pth - LOW_CE_epoch_93_iter_11750_loss_0.0386.pth ================================================== Testing Model: epoch_104_iter_13125 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_104_iter_13125.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_104_iter_13125 epoch_104_iter_13125: Mean Dice = 0.7337, Std = 0.2381 epoch_104_iter_13125: Median Dice = 0.8139, Std = 0.2381 -------------------------------------------------- epoch_104_iter_13125: Mean HD95 = 35.5836, Std = 73.3899 epoch_104_iter_13125: Median HD95 = 8.5440, Std = 73.3899 ================================================== ================================================== Testing Model: epoch_109_iter_13750 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_109_iter_13750.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_109_iter_13750 epoch_109_iter_13750: Mean Dice = 0.7413, Std = 0.2377 epoch_109_iter_13750: Median Dice = 0.8175, Std = 0.2377 -------------------------------------------------- epoch_109_iter_13750: Mean HD95 = 31.0773, Std = 66.8304 epoch_109_iter_13750: Median HD95 = 8.2462, Std = 66.8304 ================================================== ================================================== Testing Model: epoch_114_iter_14375 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_114_iter_14375.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_114_iter_14375 epoch_114_iter_14375: Mean Dice = 0.7392, Std = 0.2357 epoch_114_iter_14375: Median Dice = 0.8130, Std = 0.2357 -------------------------------------------------- epoch_114_iter_14375: Mean HD95 = 32.2114, Std = 70.2989 epoch_114_iter_14375: Median HD95 = 8.0623, Std = 70.2989 ================================================== ================================================== Testing Model: epoch_119_iter_15000 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_119_iter_15000.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_119_iter_15000 epoch_119_iter_15000: Mean Dice = 0.7458, Std = 0.2305 epoch_119_iter_15000: Median Dice = 0.8163, Std = 0.2305 -------------------------------------------------- epoch_119_iter_15000: Mean HD95 = 29.6504, Std = 64.4507 epoch_119_iter_15000: Median HD95 = 8.0623, Std = 64.4507 ================================================== ================================================== Testing Model: epoch_124_iter_15625 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_124_iter_15625.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_124_iter_15625 epoch_124_iter_15625: Mean Dice = 0.7409, Std = 0.2326 epoch_124_iter_15625: Median Dice = 0.8145, Std = 0.2326 -------------------------------------------------- epoch_124_iter_15625: Mean HD95 = 29.9770, Std = 65.0437 epoch_124_iter_15625: Median HD95 = 8.1542, Std = 65.0437 ================================================== ================================================== Testing Model: epoch_129_iter_16250 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_129_iter_16250.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_129_iter_16250 epoch_129_iter_16250: Mean Dice = 0.7471, Std = 0.2319 epoch_129_iter_16250: Median Dice = 0.8187, Std = 0.2319 -------------------------------------------------- epoch_129_iter_16250: Mean HD95 = 29.8864, Std = 65.3258 epoch_129_iter_16250: Median HD95 = 8.2462, Std = 65.3258 ================================================== ================================================== Testing Model: epoch_134_iter_16875 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_134_iter_16875.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_134_iter_16875 epoch_134_iter_16875: Mean Dice = 0.7357, Std = 0.2404 epoch_134_iter_16875: Median Dice = 0.8131, Std = 0.2404 -------------------------------------------------- epoch_134_iter_16875: Mean HD95 = 34.9117, Std = 71.7900 epoch_134_iter_16875: Median HD95 = 9.0471, Std = 71.7900 ================================================== ================================================== Testing Model: epoch_139_iter_17500 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_139_iter_17500.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_139_iter_17500 epoch_139_iter_17500: Mean Dice = 0.7430, Std = 0.2339 epoch_139_iter_17500: Median Dice = 0.8163, Std = 0.2339 -------------------------------------------------- epoch_139_iter_17500: Mean HD95 = 30.1785, Std = 65.5932 epoch_139_iter_17500: Median HD95 = 8.0623, Std = 65.5932 ================================================== ================================================== Testing Model: epoch_144_iter_18125 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_144_iter_18125.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_144_iter_18125 epoch_144_iter_18125: Mean Dice = 0.7346, Std = 0.2374 epoch_144_iter_18125: Median Dice = 0.8107, Std = 0.2374 -------------------------------------------------- epoch_144_iter_18125: Mean HD95 = 35.9121, Std = 72.4475 epoch_144_iter_18125: Median HD95 = 9.1046, Std = 72.4475 ================================================== ================================================== Testing Model: epoch_149 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_149.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_149 epoch_149: Mean Dice = 0.7357, Std = 0.2376 epoch_149: Median Dice = 0.8121, Std = 0.2376 -------------------------------------------------- epoch_149: Mean HD95 = 31.8905, Std = 68.2319 epoch_149: Median HD95 = 8.1818, Std = 68.2319 ================================================== ================================================== Testing Model: epoch_149_iter_18750 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_149_iter_18750.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_149_iter_18750 epoch_149_iter_18750: Mean Dice = 0.7357, Std = 0.2376 epoch_149_iter_18750: Median Dice = 0.8121, Std = 0.2376 -------------------------------------------------- epoch_149_iter_18750: Mean HD95 = 31.8905, Std = 68.2319 epoch_149_iter_18750: Median HD95 = 8.1818, Std = 68.2319 ================================================== ================================================== Testing Model: epoch_154_iter_19375 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_154_iter_19375.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_154_iter_19375 epoch_154_iter_19375: Mean Dice = 0.7325, Std = 0.2393 epoch_154_iter_19375: Median Dice = 0.8096, Std = 0.2393 -------------------------------------------------- epoch_154_iter_19375: Mean HD95 = 33.7186, Std = 70.6683 epoch_154_iter_19375: Median HD95 = 8.9443, Std = 70.6683 ================================================== ================================================== Testing Model: epoch_159_iter_20000 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_159_iter_20000.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_159_iter_20000 epoch_159_iter_20000: Mean Dice = 0.7217, Std = 0.2436 epoch_159_iter_20000: Median Dice = 0.8054, Std = 0.2436 -------------------------------------------------- epoch_159_iter_20000: Mean HD95 = 36.3708, Std = 70.5918 epoch_159_iter_20000: Median HD95 = 10.0249, Std = 70.5918 ================================================== ================================================== Testing Model: epoch_162 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_162.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_162 epoch_162: Mean Dice = 0.7305, Std = 0.2378 epoch_162: Median Dice = 0.8091, Std = 0.2378 -------------------------------------------------- epoch_162: Mean HD95 = 34.5533, Std = 72.6905 epoch_162: Median HD95 = 9.2113, Std = 72.6905 ================================================== ================================================== Testing Model: epoch_49_iter_6250 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_49_iter_6250.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_49_iter_6250 epoch_49_iter_6250: Mean Dice = 0.7307, Std = 0.2414 epoch_49_iter_6250: Median Dice = 0.8051, Std = 0.2414 -------------------------------------------------- epoch_49_iter_6250: Mean HD95 = 30.9183, Std = 65.5214 epoch_49_iter_6250: Median HD95 = 8.7441, Std = 65.5214 ================================================== ================================================== Testing Model: epoch_54_iter_6875 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_54_iter_6875.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_54_iter_6875 epoch_54_iter_6875: Mean Dice = 0.7174, Std = 0.2418 epoch_54_iter_6875: Median Dice = 0.7931, Std = 0.2418 -------------------------------------------------- epoch_54_iter_6875: Mean HD95 = 35.9845, Std = 72.3425 epoch_54_iter_6875: Median HD95 = 10.0000, Std = 72.3425 ================================================== ================================================== Testing Model: epoch_59_iter_7500 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_59_iter_7500.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_59_iter_7500 epoch_59_iter_7500: Mean Dice = 0.7290, Std = 0.2359 epoch_59_iter_7500: Median Dice = 0.7995, Std = 0.2359 -------------------------------------------------- epoch_59_iter_7500: Mean HD95 = 30.5274, Std = 64.7945 epoch_59_iter_7500: Median HD95 = 8.6023, Std = 64.7945 ================================================== ================================================== Testing Model: epoch_64_iter_8125 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_64_iter_8125.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_64_iter_8125 epoch_64_iter_8125: Mean Dice = 0.7240, Std = 0.2425 epoch_64_iter_8125: Median Dice = 0.8013, Std = 0.2425 -------------------------------------------------- epoch_64_iter_8125: Mean HD95 = 34.0232, Std = 70.3870 epoch_64_iter_8125: Median HD95 = 8.9443, Std = 70.3870 ================================================== ================================================== Testing Model: epoch_69_iter_8750 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_69_iter_8750.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_69_iter_8750 epoch_69_iter_8750: Mean Dice = 0.7060, Std = 0.2523 epoch_69_iter_8750: Median Dice = 0.7933, Std = 0.2523 -------------------------------------------------- epoch_69_iter_8750: Mean HD95 = 40.1026, Std = 74.9968 epoch_69_iter_8750: Median HD95 = 10.0573, Std = 74.9968 ================================================== ================================================== Testing Model: epoch_74_iter_9375 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_74_iter_9375.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_74_iter_9375 epoch_74_iter_9375: Mean Dice = 0.7401, Std = 0.2325 epoch_74_iter_9375: Median Dice = 0.8104, Std = 0.2325 -------------------------------------------------- epoch_74_iter_9375: Mean HD95 = 28.6624, Std = 61.1903 epoch_74_iter_9375: Median HD95 = 8.4853, Std = 61.1903 ================================================== ================================================== Testing Model: epoch_79_iter_10000 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_79_iter_10000.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_79_iter_10000 epoch_79_iter_10000: Mean Dice = 0.7333, Std = 0.2400 epoch_79_iter_10000: Median Dice = 0.8127, Std = 0.2400 -------------------------------------------------- epoch_79_iter_10000: Mean HD95 = 33.0936, Std = 70.4477 epoch_79_iter_10000: Median HD95 = 8.6023, Std = 70.4477 ================================================== ================================================== Testing Model: epoch_84_iter_10625 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_84_iter_10625.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_84_iter_10625 epoch_84_iter_10625: Mean Dice = 0.7430, Std = 0.2300 epoch_84_iter_10625: Median Dice = 0.8127, Std = 0.2300 -------------------------------------------------- epoch_84_iter_10625: Mean HD95 = 28.8617, Std = 63.3359 epoch_84_iter_10625: Median HD95 = 8.0623, Std = 63.3359 ================================================== ================================================== Testing Model: epoch_89_iter_11250 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_89_iter_11250.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_89_iter_11250 epoch_89_iter_11250: Mean Dice = 0.7398, Std = 0.2333 epoch_89_iter_11250: Median Dice = 0.8101, Std = 0.2333 -------------------------------------------------- epoch_89_iter_11250: Mean HD95 = 32.3396, Std = 67.1093 epoch_89_iter_11250: Median HD95 = 9.0000, Std = 67.1093 ================================================== ================================================== Testing Model: epoch_94_iter_11875 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_94_iter_11875.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_94_iter_11875 epoch_94_iter_11875: Mean Dice = 0.7516, Std = 0.2207 epoch_94_iter_11875: Median Dice = 0.8153, Std = 0.2207 -------------------------------------------------- epoch_94_iter_11875: Mean HD95 = 27.4172, Std = 61.2378 epoch_94_iter_11875: Median HD95 = 8.0000, Std = 61.2378 ================================================== ================================================== Testing Model: epoch_99 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_99.pth Starting inference... Testing on 1035 samples
Testing: 0%| | 0/1035 [00:00<?, ?it/s]
================================================== Results For Model: epoch_99 epoch_99: Mean Dice = 0.7294, Std = 0.2457 epoch_99: Median Dice = 0.8088, Std = 0.2457 -------------------------------------------------- epoch_99: Mean HD95 = 30.0303, Std = 63.4618 epoch_99: Median HD95 = 8.6023, Std = 63.4618 ================================================== ================================================== Testing Model: epoch_99_iter_12500 ================================================== Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\epoch_99_iter_12500.pth Starting inference... Testing on 1035 samples
==================================================
Results For Model: epoch_99_iter_12500
epoch_99_iter_12500: Mean Dice = 0.7294, Std = 0.2457
epoch_99_iter_12500: Median Dice = 0.8088, Std = 0.2457
--------------------------------------------------
epoch_99_iter_12500: Mean HD95 = 30.0303, Std = 63.4618
epoch_99_iter_12500: Median HD95 = 8.6023, Std = 63.4618
==================================================
==================================================
Testing Model: LOW_CE_epoch_113_iter_14250_loss_0.0575
==================================================
Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_113_iter_14250_loss_0.0575.pth
Starting inference...
Testing on 1035 samples
==================================================
Results For Model: LOW_CE_epoch_113_iter_14250_loss_0.0575
LOW_CE_epoch_113_iter_14250_loss_0.0575: Mean Dice = 0.7397, Std = 0.2332
LOW_CE_epoch_113_iter_14250_loss_0.0575: Median Dice = 0.8102, Std = 0.2332
--------------------------------------------------
LOW_CE_epoch_113_iter_14250_loss_0.0575: Mean HD95 = 29.9982, Std = 64.7857
LOW_CE_epoch_113_iter_14250_loss_0.0575: Median HD95 = 8.0623, Std = 64.7857
==================================================
==================================================
Testing Model: LOW_CE_epoch_121_iter_15250_loss_0.0375
==================================================
Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_121_iter_15250_loss_0.0375.pth
Starting inference...
Testing on 1035 samples
==================================================
Results For Model: LOW_CE_epoch_121_iter_15250_loss_0.0375
LOW_CE_epoch_121_iter_15250_loss_0.0375: Mean Dice = 0.7313, Std = 0.2344
LOW_CE_epoch_121_iter_15250_loss_0.0375: Median Dice = 0.8044, Std = 0.2344
--------------------------------------------------
LOW_CE_epoch_121_iter_15250_loss_0.0375: Mean HD95 = 31.8384, Std = 67.1647
LOW_CE_epoch_121_iter_15250_loss_0.0375: Median HD95 = 8.6023, Std = 67.1647
==================================================
==================================================
Testing Model: LOW_CE_epoch_126_iter_15875_loss_0.0559
==================================================
Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_126_iter_15875_loss_0.0559.pth
Starting inference...
Testing on 1035 samples
==================================================
Results For Model: LOW_CE_epoch_126_iter_15875_loss_0.0559
LOW_CE_epoch_126_iter_15875_loss_0.0559: Mean Dice = 0.7433, Std = 0.2307
LOW_CE_epoch_126_iter_15875_loss_0.0559: Median Dice = 0.8117, Std = 0.2307
--------------------------------------------------
LOW_CE_epoch_126_iter_15875_loss_0.0559: Mean HD95 = 31.7125, Std = 67.2290
LOW_CE_epoch_126_iter_15875_loss_0.0559: Median HD95 = 8.5440, Std = 67.2290
==================================================
==================================================
Testing Model: LOW_CE_epoch_129_iter_16250_loss_0.0573
==================================================
Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_129_iter_16250_loss_0.0573.pth
Starting inference...
Testing on 1035 samples
==================================================
Results For Model: LOW_CE_epoch_129_iter_16250_loss_0.0573
LOW_CE_epoch_129_iter_16250_loss_0.0573: Mean Dice = 0.7471, Std = 0.2319
LOW_CE_epoch_129_iter_16250_loss_0.0573: Median Dice = 0.8187, Std = 0.2319
--------------------------------------------------
LOW_CE_epoch_129_iter_16250_loss_0.0573: Mean HD95 = 29.8864, Std = 65.3258
LOW_CE_epoch_129_iter_16250_loss_0.0573: Median HD95 = 8.2462, Std = 65.3258
==================================================
==================================================
Testing Model: LOW_CE_epoch_93_iter_11750_loss_0.0386
==================================================
Loading model from: model/TU_GF7224/TU_pretrain_R50-ViT-B_16_skip3_epo163_bs25_lr0.001_224_s42\LOW_CE_epoch_93_iter_11750_loss_0.0386.pth
Starting inference...
Testing on 1035 samples
==================================================
Results For Model: LOW_CE_epoch_93_iter_11750_loss_0.0386
LOW_CE_epoch_93_iter_11750_loss_0.0386: Mean Dice = 0.7343, Std = 0.2378
LOW_CE_epoch_93_iter_11750_loss_0.0386: Median Dice = 0.8104, Std = 0.2378
--------------------------------------------------
LOW_CE_epoch_93_iter_11750_loss_0.0386: Mean HD95 = 29.5207, Std = 64.2300
LOW_CE_epoch_93_iter_11750_loss_0.0386: Median HD95 = 8.2462, Std = 64.2300
==================================================

Model Comparison Results (Sorted by Mean Dice):
====================================================================================================
| | Model | Mean Dice | Std Dice | Mean HD95 | Std HD95 | Median Dice | Median HD95 | Max Dice | Min Dice |
|---|---|---|---|---|---|---|---|---|---|
| 23 | epoch_94_iter_11875 | 0.751649 | 0.220656 | 27.417227 | 61.237768 | 0.815338 | 8.000000 | 1.0 | 0.0 |
| 5 | epoch_129_iter_16250 | 0.747051 | 0.231935 | 29.886420 | 65.325750 | 0.818664 | 8.246211 | 1.0 | 0.0 |
| 29 | LOW_CE_epoch_129_iter_16250_loss_0.0573 | 0.747051 | 0.231935 | 29.886420 | 65.325750 | 0.818664 | 8.246211 | 1.0 | 0.0 |
| 3 | epoch_119_iter_15000 | 0.745775 | 0.230493 | 29.650399 | 64.450662 | 0.816257 | 8.062258 | 1.0 | 0.0 |
| 28 | LOW_CE_epoch_126_iter_15875_loss_0.0559 | 0.743278 | 0.230691 | 31.712509 | 67.229009 | 0.811669 | 8.544004 | 1.0 | 0.0 |
| 7 | epoch_139_iter_17500 | 0.742970 | 0.233910 | 30.178537 | 65.593173 | 0.816336 | 8.062258 | 1.0 | 0.0 |
| 21 | epoch_84_iter_10625 | 0.742963 | 0.229957 | 28.861673 | 63.335946 | 0.812687 | 8.062258 | 1.0 | 0.0 |
| 1 | epoch_109_iter_13750 | 0.741331 | 0.237707 | 31.077298 | 66.830365 | 0.817526 | 8.246211 | 1.0 | 0.0 |
| 4 | epoch_124_iter_15625 | 0.740945 | 0.232636 | 29.977040 | 65.043682 | 0.814515 | 8.154234 | 1.0 | 0.0 |
| 19 | epoch_74_iter_9375 | 0.740073 | 0.232473 | 28.662412 | 61.190262 | 0.810391 | 8.485281 | 1.0 | 0.0 |
| 22 | epoch_89_iter_11250 | 0.739835 | 0.233307 | 32.339582 | 67.109268 | 0.810083 | 9.000000 | 1.0 | 0.0 |
| 26 | LOW_CE_epoch_113_iter_14250_loss_0.0575 | 0.739662 | 0.233171 | 29.998160 | 64.785662 | 0.810197 | 8.062258 | 1.0 | 0.0 |
| 2 | epoch_114_iter_14375 | 0.739219 | 0.235722 | 32.211425 | 70.298897 | 0.813012 | 8.062258 | 1.0 | 0.0 |
| 9 | epoch_149 | 0.735745 | 0.237571 | 31.890521 | 68.231875 | 0.812070 | 8.181828 | 1.0 | 0.0 |
| 10 | epoch_149_iter_18750 | 0.735745 | 0.237571 | 31.890521 | 68.231875 | 0.812070 | 8.181828 | 1.0 | 0.0 |
| 6 | epoch_134_iter_16875 | 0.735659 | 0.240446 | 34.911670 | 71.790029 | 0.813118 | 9.047077 | 1.0 | 0.0 |
| 8 | epoch_144_iter_18125 | 0.734607 | 0.237366 | 35.912058 | 72.447460 | 0.810662 | 9.104633 | 1.0 | 0.0 |
| 30 | LOW_CE_epoch_93_iter_11750_loss_0.0386 | 0.734310 | 0.237822 | 29.520738 | 64.229981 | 0.810378 | 8.246211 | 1.0 | 0.0 |
| 0 | epoch_104_iter_13125 | 0.733686 | 0.238100 | 35.583605 | 73.389912 | 0.813859 | 8.544004 | 1.0 | 0.0 |
| 20 | epoch_79_iter_10000 | 0.733316 | 0.239987 | 33.093608 | 70.447702 | 0.812729 | 8.602325 | 1.0 | 0.0 |
| 11 | epoch_154_iter_19375 | 0.732486 | 0.239264 | 33.718607 | 70.668271 | 0.809594 | 8.944272 | 1.0 | 0.0 |
| 27 | LOW_CE_epoch_121_iter_15250_loss_0.0375 | 0.731289 | 0.234403 | 31.838361 | 67.164692 | 0.804395 | 8.602325 | 1.0 | 0.0 |
| 14 | epoch_49_iter_6250 | 0.730672 | 0.241366 | 30.918297 | 65.521412 | 0.805051 | 8.744138 | 1.0 | 0.0 |
| 13 | epoch_162 | 0.730517 | 0.237756 | 34.553314 | 72.690472 | 0.809128 | 9.211336 | 1.0 | 0.0 |
| 24 | epoch_99 | 0.729376 | 0.245724 | 30.030342 | 63.461827 | 0.808831 | 8.602325 | 1.0 | 0.0 |
| 25 | epoch_99_iter_12500 | 0.729376 | 0.245724 | 30.030342 | 63.461827 | 0.808831 | 8.602325 | 1.0 | 0.0 |
| 16 | epoch_59_iter_7500 | 0.729024 | 0.235870 | 30.527404 | 64.794543 | 0.799470 | 8.602325 | 1.0 | 0.0 |
| 17 | epoch_64_iter_8125 | 0.723984 | 0.242519 | 34.023163 | 70.387038 | 0.801342 | 8.944272 | 1.0 | 0.0 |
| 12 | epoch_159_iter_20000 | 0.721729 | 0.243553 | 36.370824 | 70.591781 | 0.805362 | 10.024938 | 1.0 | 0.0 |
| 15 | epoch_54_iter_6875 | 0.717371 | 0.241837 | 35.984473 | 72.342546 | 0.793067 | 10.000000 | 1.0 | 0.0 |
| 18 | epoch_69_iter_8750 | 0.706041 | 0.252250 | 40.102620 | 74.996808 | 0.793301 | 10.057284 | 1.0 | 0.0 |
Best performing model: epoch_94_iter_11875
Mean Dice: 0.7516
Mean HD95: 27.4172
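For context on the numbers above: Dice measures region overlap between the predicted and ground-truth building masks (higher is better), while HD95 is the 95th-percentile Hausdorff distance between the two mask boundaries in pixels (lower is better). Note that the Std printed after both the mean and the median is the same per-tile standard deviation. The helper below is a minimal sketch of such a per-tile computation, built on the medpy library in the spirit of the TransUNet evaluation utilities; the function name metric_per_tile and the empty-mask conventions are assumptions, not the exact test code.

# Minimal sketch of a per-tile Dice / HD95 computation (assumed helper, not the exact test code)
import numpy as np
from medpy import metric

def metric_per_tile(pred, gt):
    """Binary Dice and HD95 for one tile; pred and gt are 2D arrays of 0s and 1s."""
    pred = (pred > 0).astype(np.uint8)
    gt = (gt > 0).astype(np.uint8)
    if pred.sum() > 0 and gt.sum() > 0:
        return metric.binary.dc(pred, gt), metric.binary.hd95(pred, gt)
    if pred.sum() == 0 and gt.sum() == 0:
        return 1.0, 0.0  # assumed convention: both masks empty counts as a perfect match
    return 0.0, np.nan   # assumed convention: HD95 is undefined when only one mask is empty

# Aggregating over all 1035 test tiles then gives the statistics printed above, e.g.
# mean_dice, std_dice = np.mean(dice_scores), np.std(dice_scores)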
comparison.round(2) # Round the comparison DataFrame for better readability
| | Model | Mean Dice | Std Dice | Mean HD95 | Std HD95 | Median Dice | Median HD95 | Max Dice | Min Dice |
|---|---|---|---|---|---|---|---|---|---|
| 23 | epoch_94_iter_11875 | 0.75 | 0.22 | 27.42 | 61.24 | 0.82 | 8.00 | 1.0 | 0.0 |
| 5 | epoch_129_iter_16250 | 0.75 | 0.23 | 29.89 | 65.33 | 0.82 | 8.25 | 1.0 | 0.0 |
| 29 | LOW_CE_epoch_129_iter_16250_loss_0.0573 | 0.75 | 0.23 | 29.89 | 65.33 | 0.82 | 8.25 | 1.0 | 0.0 |
| 3 | epoch_119_iter_15000 | 0.75 | 0.23 | 29.65 | 64.45 | 0.82 | 8.06 | 1.0 | 0.0 |
| 28 | LOW_CE_epoch_126_iter_15875_loss_0.0559 | 0.74 | 0.23 | 31.71 | 67.23 | 0.81 | 8.54 | 1.0 | 0.0 |
| 7 | epoch_139_iter_17500 | 0.74 | 0.23 | 30.18 | 65.59 | 0.82 | 8.06 | 1.0 | 0.0 |
| 21 | epoch_84_iter_10625 | 0.74 | 0.23 | 28.86 | 63.34 | 0.81 | 8.06 | 1.0 | 0.0 |
| 1 | epoch_109_iter_13750 | 0.74 | 0.24 | 31.08 | 66.83 | 0.82 | 8.25 | 1.0 | 0.0 |
| 4 | epoch_124_iter_15625 | 0.74 | 0.23 | 29.98 | 65.04 | 0.81 | 8.15 | 1.0 | 0.0 |
| 19 | epoch_74_iter_9375 | 0.74 | 0.23 | 28.66 | 61.19 | 0.81 | 8.49 | 1.0 | 0.0 |
| 22 | epoch_89_iter_11250 | 0.74 | 0.23 | 32.34 | 67.11 | 0.81 | 9.00 | 1.0 | 0.0 |
| 26 | LOW_CE_epoch_113_iter_14250_loss_0.0575 | 0.74 | 0.23 | 30.00 | 64.79 | 0.81 | 8.06 | 1.0 | 0.0 |
| 2 | epoch_114_iter_14375 | 0.74 | 0.24 | 32.21 | 70.30 | 0.81 | 8.06 | 1.0 | 0.0 |
| 9 | epoch_149 | 0.74 | 0.24 | 31.89 | 68.23 | 0.81 | 8.18 | 1.0 | 0.0 |
| 10 | epoch_149_iter_18750 | 0.74 | 0.24 | 31.89 | 68.23 | 0.81 | 8.18 | 1.0 | 0.0 |
| 6 | epoch_134_iter_16875 | 0.74 | 0.24 | 34.91 | 71.79 | 0.81 | 9.05 | 1.0 | 0.0 |
| 8 | epoch_144_iter_18125 | 0.73 | 0.24 | 35.91 | 72.45 | 0.81 | 9.10 | 1.0 | 0.0 |
| 30 | LOW_CE_epoch_93_iter_11750_loss_0.0386 | 0.73 | 0.24 | 29.52 | 64.23 | 0.81 | 8.25 | 1.0 | 0.0 |
| 0 | epoch_104_iter_13125 | 0.73 | 0.24 | 35.58 | 73.39 | 0.81 | 8.54 | 1.0 | 0.0 |
| 20 | epoch_79_iter_10000 | 0.73 | 0.24 | 33.09 | 70.45 | 0.81 | 8.60 | 1.0 | 0.0 |
| 11 | epoch_154_iter_19375 | 0.73 | 0.24 | 33.72 | 70.67 | 0.81 | 8.94 | 1.0 | 0.0 |
| 27 | LOW_CE_epoch_121_iter_15250_loss_0.0375 | 0.73 | 0.23 | 31.84 | 67.16 | 0.80 | 8.60 | 1.0 | 0.0 |
| 14 | epoch_49_iter_6250 | 0.73 | 0.24 | 30.92 | 65.52 | 0.81 | 8.74 | 1.0 | 0.0 |
| 13 | epoch_162 | 0.73 | 0.24 | 34.55 | 72.69 | 0.81 | 9.21 | 1.0 | 0.0 |
| 24 | epoch_99 | 0.73 | 0.25 | 30.03 | 63.46 | 0.81 | 8.60 | 1.0 | 0.0 |
| 25 | epoch_99_iter_12500 | 0.73 | 0.25 | 30.03 | 63.46 | 0.81 | 8.60 | 1.0 | 0.0 |
| 16 | epoch_59_iter_7500 | 0.73 | 0.24 | 30.53 | 64.79 | 0.80 | 8.60 | 1.0 | 0.0 |
| 17 | epoch_64_iter_8125 | 0.72 | 0.24 | 34.02 | 70.39 | 0.80 | 8.94 | 1.0 | 0.0 |
| 12 | epoch_159_iter_20000 | 0.72 | 0.24 | 36.37 | 70.59 | 0.81 | 10.02 | 1.0 | 0.0 |
| 15 | epoch_54_iter_6875 | 0.72 | 0.24 | 35.98 | 72.34 | 0.79 | 10.00 | 1.0 | 0.0 |
| 18 | epoch_69_iter_8750 | 0.71 | 0.25 | 40.10 | 75.00 | 0.79 | 10.06 | 1.0 | 0.0 |
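For completeness, the comparison DataFrame and the best-model summary can be assembled from the per-checkpoint score lists along the following lines. This is a hedged sketch: the results mapping (checkpoint name to per-tile Dice and HD95 lists) and the build_comparison name are assumptions about how the statistics were collected, not the notebook's actual variables.

# Sketch: build the checkpoint-comparison DataFrame from per-tile scores.
# `results` maps checkpoint name -> (dice_scores, hd95_scores); this structure is assumed.
import numpy as np
import pandas as pd

def build_comparison(results):
    rows = []
    for name, (dice_scores, hd95_scores) in results.items():
        rows.append({
            "Model": name,
            "Mean Dice": np.mean(dice_scores),
            "Std Dice": np.std(dice_scores),
            "Mean HD95": np.nanmean(hd95_scores),
            "Std HD95": np.nanstd(hd95_scores),
            "Median Dice": np.median(dice_scores),
            "Median HD95": np.nanmedian(hd95_scores),
            "Max Dice": np.max(dice_scores),
            "Min Dice": np.min(dice_scores),
        })
    # Highest mean Dice first, matching the sorted tables above.
    return pd.DataFrame(rows).sort_values("Mean Dice", ascending=False)

# Usage: comparison = build_comparison(results); best = comparison.iloc[0]
# print(f"Best performing model: {best['Model']} (Mean Dice {best['Mean Dice']:.4f})")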